author: Craig Jennings <c@cjennings.net>, 2026-01-22 23:21:18 -0600
committer: Craig Jennings <c@cjennings.net>, 2026-01-22 23:21:18 -0600
commit: 0ffe7a85a1b024b88e4ddc3305c5f805edd6e8e1
tree: ccd6c610630cce9eef268ab692999cdfe3bb5a1b (/docs)
parent: 197a8036af21232276cfbd9624d9eeeebe722df6
Replace GRUB with ZFSBootMenu bootloader
This is a major change that replaces the GRUB bootloader with ZFSBootMenu,
providing native ZFS boot environment support.

Key changes:
- EFI partition reduced from 1GB to 512MB (only holds ZFSBootMenu)
- EFI now mounts at /efi instead of /boot
- Kernel and initramfs live on ZFS root (enables snapshot boot with matching kernel)
- Downloads pre-built ZFSBootMenu EFI binary from get.zfsbootmenu.org
- Creates EFI boot entries for all disks in multi-disk configurations
- Syncs ZFSBootMenu to all EFI partitions for redundancy
- Sets org.zfsbootmenu:commandline on zroot/ROOT for kernel cmdline inheritance
- Sets bootfs pool property for default boot environment
- AMD GPU workarounds (pg_mask, cwsr_enable) added to kernel cmdline when AMD detected

Deleted GRUB snapshot tooling (no longer needed):
- custom/grub-zfs-snap
- custom/40_zfs_snapshots
- custom/zz-grub-zfs-snap.hook
- custom/zfs-snap-prune

Updated helper scripts:
- zfssnapshot: removed grub-zfs-snap call, shows ZFSBootMenu tip
- zfsrollback: removed grub-zfs-snap call, notes auto-detection

Tested configurations:
- Single disk installation
- 2-disk mirror (mirror-0)
- 3-disk RAIDZ1 (raidz1-0)

All boot correctly with ZFSBootMenu.
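
The command-line handling described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the installer's actual code: the helper name build_cmdline, the base parameters "rw loglevel=3", and the way the GPU vendor is passed in are all hypothetical. The dataset names zroot/ROOT and zroot/ROOT/default are the ones used elsewhere in this commit. The sketch only prints the zfs and zpool commands, so it can be run without a pool.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the installer): compose the kernel
# command line, appending the AMD workarounds only for an AMD GPU.
build_cmdline() {
    base="$1"        # base kernel parameters (assumed example value below)
    gpu_vendor="$2"  # "amd" or anything else; detection is out of scope here
    if [ "$gpu_vendor" = "amd" ]; then
        printf '%s amdgpu.pg_mask=0 amdgpu.cwsr_enable=0\n' "$base"
    else
        printf '%s\n' "$base"
    fi
}

# Dry run: print the commands instead of executing them.
cmdline=$(build_cmdline "rw loglevel=3" amd)
echo "zfs set org.zfsbootmenu:commandline=\"$cmdline\" zroot/ROOT"
echo "zpool set bootfs=zroot/ROOT/default zroot"
```

Setting the property on the zroot/ROOT parent rather than on a single boot environment means every dataset below it inherits the same command line, which is the inheritance behaviour the message refers to.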
Diffstat (limited to 'docs')
-rw-r--r-- docs/2026-01-22-ratio-amd-gpu-freeze-fix-instructions.org | 224
-rw-r--r-- docs/research-sandreas-zarch.org | 365
-rw-r--r-- docs/session-context.org | 52
3 files changed, 641 insertions, 0 deletions
diff --git a/docs/2026-01-22-ratio-amd-gpu-freeze-fix-instructions.org b/docs/2026-01-22-ratio-amd-gpu-freeze-fix-instructions.org
new file mode 100644
index 0000000..d6b8461
--- /dev/null
+++ b/docs/2026-01-22-ratio-amd-gpu-freeze-fix-instructions.org
@@ -0,0 +1,224 @@
+AMD Strix Halo VPE/CWSR Freeze Fix Instructions
+===============================================
+Created: 2026-01-22
+Machine: ratio (Framework Desktop, AMD Ryzen AI Max 300)
+
+PROBLEM SUMMARY
+---------------
+Two AMD GPU bugs cause random freezes on Strix Halo:
+
+1. VPE Power Gating Bug
+ - VPE (Video Processing Engine) tries to power gate after 1 second idle
+ - SMU hangs, system freezes
+ - Fix: amdgpu.pg_mask=0 (disables power gating)
+
+2. CWSR Bug (Compute Wavefront Save/Restore)
+ - MES firmware hang under compute loads
+ - Causes GPU reset loops and crashes
+ - Fix: amdgpu.cwsr_enable=0
+
+Current state on ratio:
+- pg_mask = 4294967295 (power gating ENABLED - bad)
+- cwsr_enable = 1 (CWSR ENABLED - bad)
+- Neither workaround is applied
+
+
+PART 1: GRUB CMDLINE FIX (Quick, can do now)
+============================================
+This adds the parameters to the kernel command line via GRUB.
+Can be done on the running system, takes effect on next boot.
+
+Step 1: Edit GRUB defaults
+--------------------------
+sudo nano /etc/default/grub
+
+Find the line:
+GRUB_CMDLINE_LINUX_DEFAULT="..."
+
+Add these parameters (keep existing ones):
+GRUB_CMDLINE_LINUX_DEFAULT="... amdgpu.pg_mask=0 amdgpu.cwsr_enable=0"
+
+Example - if current line is:
+GRUB_CMDLINE_LINUX_DEFAULT="loglevel=2 rd.systemd.show_status=auto rd.udev.log_level=2 nvme.noacpi=1 mem_sleep_default=deep nowatchdog random.trust_cpu=off quiet splash"
+
+Change to:
+GRUB_CMDLINE_LINUX_DEFAULT="loglevel=2 rd.systemd.show_status=auto rd.udev.log_level=2 nvme.noacpi=1 mem_sleep_default=deep nowatchdog random.trust_cpu=off quiet splash amdgpu.pg_mask=0 amdgpu.cwsr_enable=0"
+
+Step 2: Regenerate GRUB config
+------------------------------
+sudo grub-mkconfig -o /boot/grub/grub.cfg
+
+Step 3: Reboot and verify
+-------------------------
+sudo reboot
+
+After reboot, verify:
+cat /sys/module/amdgpu/parameters/pg_mask
+# Should show: 0
+
+cat /sys/module/amdgpu/parameters/cwsr_enable
+# Should show: 0
+
+cat /proc/cmdline | grep -oE "(pg_mask|cwsr_enable)=[^ ]+"
+# Should show:
+# pg_mask=0
+# cwsr_enable=0
+
+
+PART 2: MODPROBE.D FIX (Permanent, requires live ISO)
+=====================================================
+This embeds the parameters in the initramfs so they're always applied.
+MUST be done from live ISO because mkinitcpio triggers the freeze.
+
+Step 1: Boot archzfs live ISO
+-----------------------------
+- Boot from USB with archzfs ISO
+- Get to root shell
+
+Step 2: Import and mount ZFS
+----------------------------
+zpool import -f -R /mnt zroot
+zfs mount zroot/ROOT/default
+mount /dev/nvme1n1p1 /mnt/boot # Note: nvme1n1p1, not nvme0n1p1!
+
+Verify:
+ls /mnt/boot/vmlinuz*
+# Should show kernel images
+
+Step 3: Create modprobe config
+------------------------------
+cat > /mnt/etc/modprobe.d/amdgpu.conf << 'EOF'
+# Workarounds for AMD Strix Halo GPU bugs
+# Created: 2026-01-22
+# Remove when kernel has proper fixes (check linux-lts >= 6.18 with fixes)
+
+# Disable power gating to prevent VPE freeze
+# VPE tries to power gate after 1s idle, causing SMU hang
+options amdgpu pg_mask=0
+
+# Disable Compute Wavefront Save/Restore to prevent MES hang
+# CWSR causes MES firmware 0x80 hang under compute loads
+options amdgpu cwsr_enable=0
+EOF
+
+Step 4: Chroot and rebuild initramfs
+------------------------------------
+# Mount system directories
+mount --rbind /dev /mnt/dev
+mount --rbind /sys /mnt/sys
+mount --rbind /proc /mnt/proc
+mount --rbind /run /mnt/run
+
+# Chroot
+arch-chroot /mnt
+
+# Rebuild initramfs (this is safe from live ISO)
+mkinitcpio -P
+
+# Verify amdgpu.conf is in initramfs
+lsinitcpio /boot/initramfs-linux.img | grep amdgpu
+# Should show: etc/modprobe.d/amdgpu.conf
+
+# Exit chroot
+exit
+
+Step 5: Clean up and reboot
+---------------------------
+# Unmount everything
+umount -R /mnt/dev /mnt/sys /mnt/proc /mnt/run /mnt/boot
+zfs unmount -a
+zpool export zroot
+
+# Reboot
+reboot
+
+Step 6: Verify after reboot
+---------------------------
+cat /sys/module/amdgpu/parameters/pg_mask
+# Should show: 0
+
+cat /sys/module/amdgpu/parameters/cwsr_enable
+# Should show: 0
+
+lsinitcpio /boot/initramfs-linux.img | grep amdgpu.conf
+# Should show: etc/modprobe.d/amdgpu.conf
+
+
+VERIFICATION CHECKLIST
+======================
+After applying fixes, verify:
+
+[ ] pg_mask shows 0 (not 4294967295)
+[ ] cwsr_enable shows 0 (not 1)
+[ ] Parameters visible in /proc/cmdline (if using GRUB method)
+[ ] amdgpu.conf in initramfs (if using modprobe.d method)
+[ ] System stable - no freezes during idle
+[ ] mkinitcpio -P completes without freeze (test after fix applied)
+
+
+IMPORTANT NOTES
+===============
+
+1. Boot partition UUID
+ ratio has mirrored NVMe drives. The boot partition is on nvme1n1p1:
+ - nvme0n1p1: 6A4B-47A4 (NOT the boot partition)
+ - nvme1n1p1: 6A4A-93B1 (THIS is /boot)
+
+2. Kernel is pinned
+ /etc/pacman.conf has: IgnorePkg = linux
+ This prevents upgrading from 6.15.2 until manually unpinned.
+ DO NOT upgrade to 6.18.x - it has worse bugs for Strix Halo.
+
+3. When to remove workarounds
+ Monitor Framework Community and AMD-gfx mailing list for proper fixes.
+ When linux-lts has confirmed VPE and CWSR fixes, can try removing.
+ Test by commenting out lines in amdgpu.conf, rebuild initramfs, test.
+
+4. If system freezes during mkinitcpio
+ This means the fix isn't active yet. Must do from live ISO.
+ The modconf hook reads /etc/modprobe.d/ at build time, but the
+ running kernel still has the old parameters until reboot.
+
+
+TROUBLESHOOTING
+===============
+
+System still freezes after GRUB fix:
+- Check /proc/cmdline - are parameters there?
+- Check /sys/module/amdgpu/parameters/* - are values correct?
+- If cmdline has them but sysfs doesn't, driver may have loaded before
+ parsing. Need modprobe.d method instead.
+
+Can't import zpool from live ISO:
+- List importable pools first: zpool import
+- If "pool was previously in use": zpool import -f zroot
+- Check hostid: cat /etc/hostid on installed system
+
+mkinitcpio says "Preset not found":
+- Check /etc/mkinitcpio.d/*.preset files exist
+- For linux kernel: linux.preset
+- For linux-lts: linux-lts.preset
+
+After chroot, wrong mountpoints:
+- Reset mountpoints after any chroot work:
+ zfs set mountpoint=/ zroot/ROOT/default
+ zfs set mountpoint=/home zroot/home
+ (etc. for all datasets)
+
+
+REFERENCES
+==========
+
+VPE timeout patch (not merged):
+https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg127724.html
+
+Framework Community - critical 6.18 bugs:
+https://community.frame.work/t/attn-critical-bugs-in-amdgpu-driver-included-with-kernel-6-18-x-6-19-x/79221
+
+CWSR workaround:
+https://github.com/ROCm/ROCm/issues/5590
+
+Session documentation:
+- docs/2026-01-22-ratio-boot-fix-session.org
+- docs/2026-01-22-mkinitcpio-config-boot-failure.org
+- assets/2026-01-22-mkinitcpio-freeze-during-rebuild.org
diff --git a/docs/research-sandreas-zarch.org b/docs/research-sandreas-zarch.org
new file mode 100644
index 0000000..55bc77b
--- /dev/null
+++ b/docs/research-sandreas-zarch.org
@@ -0,0 +1,365 @@
+#+TITLE: Research: sandreas/zarch ZFSBootMenu Installation
+#+DATE: 2026-01-22
+#+AUTHOR: Research Notes
+
+* Overview
+
+This document summarizes research on the [[https://github.com/sandreas/zarch][sandreas/zarch]] GitHub repository for
+Arch Linux ZFS installation. The project uses ZFSBootMenu, native encryption,
+and automatic snapshots via zrepl.
+
+* Project Philosophy
+
+sandreas/zarch is described as a "single, non-modular file with some minor
+config profiles" - the author explicitly avoids a "modular multi-script beast."
+This contrasts with our more modular approach but offers useful patterns.
+
+** Key Features
+- ZFSBootMenu as bootloader (not GRUB)
+- Native ZFS encryption (AES-256-GCM)
+- Automatic snapshots via zrepl
+- EFI-only (no BIOS support)
+- Profile-based configuration
+
+* ZFSBootMenu Installation
+
+** Download and Install
+#+begin_src bash
+# Create EFI directory
+mkdir -p /efi/EFI/ZBM
+
+# Download latest ZFSBootMenu EFI binary
+wget -c https://get.zfsbootmenu.org/latest.EFI -O /efi/EFI/ZBM/ZFSBOOTMENU.EFI
+
+# Or use the curl variant (same target file as above)
+curl -L https://get.zfsbootmenu.org/efi -o /efi/EFI/ZBM/ZFSBOOTMENU.EFI
+#+end_src
+
+** EFI Boot Entry Registration
+#+begin_src bash
+efibootmgr --disk "$DISK" --part 1 \
+  --create \
+  --label "ZFSBootMenu" \
+  --loader '\EFI\ZBM\ZFSBOOTMENU.EFI' \
+  --unicode "spl_hostid=0x$(hostid) zbm.timeout=3 zbm.prefer=zroot zbm.import_policy=hostid" \
+  --verbose
+#+end_src
+
+** Key ZFSBootMenu Parameters
+| Parameter             | Purpose                                     |
+|-----------------------+---------------------------------------------|
+| zbm.timeout=N         | Seconds to wait before auto-booting default |
+| zbm.prefer=POOL       | Preferred pool for default boot environment |
+| zbm.import_policy     | Pool import strategy (hostid recommended)   |
+| zbm.skip              | Skip menu and boot default immediately      |
+| zbm.show              | Force menu display                          |
+| spl_hostid=0xXXXXXXXX | Host ID for pool import validation          |
+
+** Kernel Command Line for Boot Environments
+#+begin_src bash
+# Set inherited command line on ROOT dataset
+zfs set org.zfsbootmenu:commandline="quiet loglevel=0" zroot/ROOT
+
+# Set pool bootfs property
+zpool set bootfs=zroot/ROOT/arch zroot
+#+end_src
+
+* Dataset Layout
+
+** zarch Dataset Structure
+#+begin_example
+$POOL            mountpoint=none
+$POOL/ROOT       mountpoint=none (container for boot environments)
+$POOL/ROOT/arch  mountpoint=/, canmount=noauto (active root)
+$POOL/home       mountpoint=/home (shared across boot environments)
+#+end_example
+
+** Comparison: Our archzfs Dataset Structure
+#+begin_example
+zroot                 mountpoint=none, canmount=off
+zroot/ROOT            mountpoint=none, canmount=off
+zroot/ROOT/default    mountpoint=/, canmount=noauto, reservation=5-20G
+zroot/home            mountpoint=/home
+zroot/home/root       mountpoint=/root
+zroot/media           mountpoint=/media, compression=off
+zroot/vms             mountpoint=/vms, recordsize=64K
+zroot/var             mountpoint=/var, canmount=off
+zroot/var/log         mountpoint=/var/log
+zroot/var/cache       mountpoint=/var/cache
+zroot/var/lib         mountpoint=/var/lib, canmount=off
+zroot/var/lib/pacman  mountpoint=/var/lib/pacman
+zroot/var/lib/docker  mountpoint=/var/lib/docker
+zroot/var/tmp         mountpoint=/var/tmp, auto-snapshot=false
+zroot/tmp             mountpoint=/tmp, auto-snapshot=false
+#+end_example
+
+** Key Differences
+- zarch: Minimal dataset layout (ROOT, home)
+- archzfs: Fine-grained datasets with workload-specific tuning
+- archzfs: Separate /var/log, /var/cache, /var/lib/docker
+- archzfs: recordsize=64K for VM storage
+- archzfs: compression=off for media (already compressed)
+
+* ZFS Pool Creation
+
+** zarch Pool Creation (with encryption)
+#+begin_src bash
+zpool create -f \
+ -o ashift=12 \
+ -O compression=lz4 \
+ -O acltype=posixacl \
+ -O xattr=sa \
+ -O relatime=off \
+ -O atime=off \
+ -O encryption=aes-256-gcm \
+ -O keylocation=prompt \
+ -O keyformat=passphrase \
+ -o autotrim=on \
+ -m none \
+ $POOL ${DISK}-part2
+#+end_src
+
+** Our archzfs Pool Creation (with encryption)
+#+begin_src bash
+zpool create -f \
+ -o ashift="$ASHIFT" \
+ -o autotrim=on \
+ -O acltype=posixacl \
+ -O atime=off \
+ -O canmount=off \
+ -O compression="$COMPRESSION" \
+ -O dnodesize=auto \
+ -O normalization=formD \
+ -O relatime=on \
+ -O xattr=sa \
+ -O encryption=aes-256-gcm \
+ -O keyformat=passphrase \
+ -O keylocation=prompt \
+ -O mountpoint=none \
+ -R /mnt \
+ "$POOL_NAME" $pool_config
+#+end_src
+
+** Key Differences
+| Option        | zarch     | archzfs             | Notes                          |
+|---------------+-----------+---------------------+--------------------------------|
+| compression   | lz4       | zstd (configurable) | zstd better ratio, more CPU    |
+| atime         | off       | off                 | Same                           |
+| relatime      | off       | on                  | No effect while atime=off      |
+| dnodesize     | (default) | auto                | Better extended attribute perf |
+| normalization | (default) | formD               | Unicode consistency            |
+
+* Snapshot Automation
+
+** zarch: zrepl Configuration
+
+zarch uses zrepl for automated snapshots with this retention grid:
+
+#+begin_example
+1x1h(keep=4) | 24x1h(keep=1) | 7x1d(keep=1) | 4x1w(keep=1) | 12x4w(keep=1) | 1x53w(keep=1)
+#+end_example
+
+This means:
+- Keep 4 snapshots within the last hour
+- Keep 1 snapshot per hour for 24 hours
+- Keep 1 snapshot per day for 7 days
+- Keep 1 snapshot per week for 4 weeks
+- Keep 1 snapshot per 4 weeks for 12 periods (48 weeks)
+- Keep 1 snapshot per year
+
+#+begin_src yaml
+# Example zrepl.yml structure
+jobs:
+  - name: snapjob
+    type: snap
+    filesystems:
+      "zroot<": true
+    snapshotting:
+      type: periodic
+      interval: 15m
+      prefix: zrepl_
+    pruning:
+      keep:
+        - type: grid
+          grid: 1x1h(keep=all) | 24x1h | 14x1d
+          regex: "^zrepl_.*"
+        - type: regex
+          negate: true
+          regex: "^zrepl_.*"
+#+end_src
+
+** archzfs: Pacman Hook Approach
+
+Our approach uses pre-transaction snapshots:
+#+begin_src bash
+# /etc/pacman.d/hooks/zfs-snapshot.hook
+[Trigger]
+Operation = Upgrade
+Operation = Install
+Operation = Remove
+Type = Package
+Target = *
+
+[Action]
+Description = Creating ZFS snapshot before pacman transaction...
+When = PreTransaction
+Exec = /usr/local/bin/zfs-pre-snapshot
+#+end_src
+
+** Comparison: Snapshot Approaches
+| Feature          | zrepl (zarch)           | Pacman Hook (archzfs)       |
+|------------------+-------------------------+-----------------------------|
+| Trigger          | Time-based (15 min)     | Event-based (pacman)        |
+| Retention        | Complex grid policy     | Manual or sanoid            |
+| Granularity      | High (frequent)         | Package transaction focused |
+| Recovery Point   | ~15 minutes             | Last package operation      |
+| Storage overhead | Higher (more snapshots) | Lower (fewer snapshots)     |
+
+** Alternative: sanoid (mentioned in archzfs)
+Sanoid provides similar functionality to zrepl with simpler configuration:
+#+begin_src ini
+# /etc/sanoid/sanoid.conf
+[zroot/ROOT/default]
+use_template = production
+recursive = yes
+
+[template_production]
+frequently = 0
+hourly = 24
+daily = 7
+weekly = 4
+monthly = 12
+yearly = 1
+autosnap = yes
+autoprune = yes
+#+end_src
+
+* EFI and Boot Partition Strategy
+
+** zarch: 512MB EFI, ZFSBootMenu
+- Single 512MB EFI partition (type EF00)
+- ZFSBootMenu EFI binary downloaded from upstream
+- No GRUB, no separate boot partition on ZFS
+- Kernel/initramfs stored on ZFS root (ZFSBootMenu reads them)
+
+** archzfs: 1GB EFI, GRUB with ZFS Support
+- 1GB EFI partition per disk
+- GRUB with ZFS module for pool access
+- Redundant EFI partitions synced via rsync
+- Boot files in EFI partition (not ZFS)
+
+** Trade-offs
+
+| Aspect           | ZFSBootMenu                   | GRUB + ZFS                |
+|------------------+-------------------------------+---------------------------|
+| Boot environment | Native (designed for ZFS)     | Requires ZFS module       |
+| Snapshot booting | Built-in, interactive         | Custom GRUB menu entries  |
+| Encryption       | Prompts for key automatically | More complex setup        |
+| EFI space needed | Minimal (~512MB)              | Larger (kernel/initramfs) |
+| Complexity       | Simpler (single binary)       | More moving parts         |
+| Recovery         | Can browse/rollback at boot   | Requires grub.cfg regen   |
+
+* Pacman Hooks and Systemd Services
+
+** zarch Services
+#+begin_example
+zfs-import-cache
+zfs-import.target
+zfs-mount
+zfs-zed
+zfs.target
+set-locale-once.service (custom first-boot locale config)
+#+end_example
+
+** archzfs Services
+#+begin_example
+zfs.target
+zfs-import-scan.service (instead of cache-based)
+zfs-mount.service
+zfs-import.target
+NetworkManager
+avahi-daemon
+sshd
+#+end_example
+
+** Key Difference: Import Method
+- zarch: Uses zfs-import-cache (requires cachefile)
+- archzfs: Uses zfs-import-scan (scans with blkid, no cachefile needed)
+
+The scan method is simpler and more portable (works if moving disks between
+systems).
+
+* mkinitcpio Configuration
+
+** zarch Approach
+#+begin_src bash
+sed -i '/^HOOKS=/s/block filesystems/block zfs filesystems/g' /etc/mkinitcpio.conf
+#+end_src
+
+** archzfs Approach
+#+begin_src bash
+HOOKS=(base udev microcode modconf kms keyboard keymap consolefont block zfs filesystems)
+#+end_src
+
+** Important Notes
+- Both use the busybox-based initramfs with the udev hook (not the systemd hook)
+- archzfs explicitly removes autodetect to ensure all storage drivers are included
+- archzfs removes fsck (ZFS doesn't use it)
+- archzfs includes microcode early loading
+
+* Useful Patterns to Consider
+
+** 1. Profile-Based Configuration
+zarch uses a profile directory system:
+#+begin_example
+default/
+ archpkg.txt # Official packages
+ aurpkg.txt # AUR packages
+ services.txt # Services to enable
+ zarch.conf # Core configuration
+ custom-chroot.sh # Custom post-install
+#+end_example
+
+This allows maintaining multiple configurations (desktop, server, VM) cleanly.
+
+** 2. ZFSBootMenu for Simpler Boot
+For future consideration:
+- Native ZFS boot environment support
+- Interactive snapshot selection at boot
+- Simpler encryption key handling
+- Smaller EFI partition needs
+
+** 3. zrepl for Time-Based Snapshots
+For systems needing frequent snapshots beyond pacman transactions:
+- 15-minute intervals for development machines
+- Complex retention policies
+- Replication to remote systems
+
+** 4. AUR Helper Installation Pattern
+#+begin_src bash
+# Build yay as regular user, install as root
+su -c "git clone https://aur.archlinux.org/yay-bin.git" "$USER_NAME"
+arch-chroot -u "$USER_NAME" /mnt makepkg -D /home/$USER_NAME/yay-bin -s
+pacman -U --noconfirm yay-bin-*.pkg.tar.*
+#+end_src
+
+* References
+
+- [[https://github.com/sandreas/zarch][sandreas/zarch GitHub Repository]]
+- [[https://zfsbootmenu.org/][ZFSBootMenu Official Site]]
+- [[https://docs.zfsbootmenu.org/en/latest/][ZFSBootMenu Documentation]]
+- [[https://zrepl.github.io/][zrepl Documentation]]
+- [[https://wiki.archlinux.org/title/ZFS][Arch Wiki: ZFS]]
+- [[https://github.com/acrion/zfs-autosnap][zfs-autosnap - Pre-upgrade Snapshots]]
+- [[https://aur.archlinux.org/packages/pacman-zfs-hook][pacman-zfs-hook AUR Package]]
+- [[https://florianesser.ch/posts/20220714-arch-install-zbm/][Guide: Install Arch Linux on encrypted zpool with ZFSBootMenu]]
+
+* Action Items for archzfs
+
+Based on this research, potential improvements:
+
+1. [ ] Consider adding ZFSBootMenu as alternative bootloader option
+2. [ ] Evaluate zrepl for systems needing frequent time-based snapshots
+3. [ ] Document the grub-zfs-snap vs ZFSBootMenu trade-offs
+4. [ ] Consider profile-based configuration for different use cases
+5. [ ] Add sanoid configuration to archsetup for automated snapshot retention
diff --git a/docs/session-context.org b/docs/session-context.org
new file mode 100644
index 0000000..2cf29bd
--- /dev/null
+++ b/docs/session-context.org
@@ -0,0 +1,52 @@
+#+TITLE: Session Context
+#+DATE: 2026-01-22
+
+* Session: Thursday 2026-01-22 21:37 CST - ongoing
+
+** Current Task
+Creating implementation plan to replace GRUB with ZFSBootMenu in install-archzfs.
+
+** Status
+Plan written and updated with research findings.
+
+** Work Completed This Session
+
+1. Read protocols.org and NOTES.org
+2. Ran session startup workflow
+3. Found inbox item: instructions.txt (AMD GPU fix guide from earlier session)
+4. Created detailed ZFSBootMenu implementation plan
+5. Researched 5 comparable open-source projects:
+ - eoli3n/archiso-zfs + arch-config
+ - PandaScience/arch-on-zfs
+ - sandreas/zarch
+ - danboid/ALEZ
+ - danfossi/Arch-ZFS-Root-Installation-Script
+6. Updated plan with best practices from research
+
+** Key Corrections from Research
+
+CRITICAL: The original plan incorrectly proposed creating a /boot dataset.
+All researched projects agree: /boot must be a DIRECTORY inside ROOT/default,
+NOT a separate ZFS dataset. This ensures snapshots include the kernel.
+
+Other improvements adopted:
+- Set org.zfsbootmenu:commandline on ROOT parent (not ROOT/default) for inheritance
+- Add ZFSBootMenu EFI parameters: zbm.timeout, zbm.prefer, zbm.import_policy
+- Copy hostid to installed system
+- Set bootfs pool property
+
+** Files Created/Modified
+
+- PLAN-zfsbootmenu-implementation.org - Main implementation plan (project root)
+- docs/session-context.org - This file
+
+** Inbox Status
+
+1 item pending: instructions.txt (AMD GPU fix guide)
+- Recommendation: file to docs/2026-01-22-ratio-amd-gpu-freeze-fix-instructions.org
+
+** Next Steps
+
+1. File inbox item (instructions.txt)
+2. Decide whether to implement the ZFSBootMenu plan now or later
+3. If implementing: create git branch, follow plan steps, test