================================================================================
ARCHZFS RESCUE GUIDE
================================================================================

This guide covers common rescue and recovery scenarios.
For quick command reference, use: tldr <command>

Table of Contents:
    1. ZFS Recovery
    2. Data Recovery
    3. Boot Repair
    4. Windows Recovery
    5. Hardware Diagnostics
    6. Disk Operations
    7. Network Troubleshooting

================================================================================
1. ZFS RECOVERY
================================================================================

QUICK REFERENCE
---------------
    tldr zfs                  # ZFS filesystem commands
    tldr zpool                # ZFS pool commands
    man zfs                   # Full ZFS manual
    man zpool                 # Full zpool manual

SCENARIO: Import a pool from another system
-------------------------------------------
List pools available for import:
    zpool import

Import a specific pool:
    zpool import poolname

If the pool was not cleanly exported (e.g., system crash):
    zpool import -f poolname

Import with a different name (to avoid conflicts):
    zpool import oldname newname

SCENARIO: Pool won't import - "pool may be in use"
--------------------------------------------------
Force import (only when you know no other system has the pool imported):
    zpool import -f poolname

If that fails, try recovery mode. -F rewinds the pool to an earlier
transaction group, so the most recent writes may be discarded:
    zpool import -F poolname

Last resort - import read-only to recover data:
    zpool import -o readonly=on poolname

SCENARIO: Check pool health and repair
--------------------------------------
Check pool status:
    zpool status poolname

Start a scrub (checks all data, can take hours):
    zpool scrub poolname

Check scrub progress:
    zpool status poolname

Clear transient errors after fixing hardware:
    zpool clear poolname

SCENARIO: Recover from snapshot / Rollback
------------------------------------------
List all snapshots:
    zfs list -t snapshot

Rollback to a snapshot (destroys changes since snapshot):
    zfs rollback poolname/dataset@snapshot

If newer snapshots exist after the target, use -r (this destroys them):
    zfs rollback -r poolname/dataset@snapshot

SCENARIO: Copy data from ZFS pool
---------------------------------
Mount datasets if not auto-mounted:
    zfs mount -a

Or mount a specific dataset at a temporary location:
    zfs set mountpoint=/mnt/recovery poolname/dataset
    zfs mount poolname/dataset

Copy with rsync (preserves permissions, shows progress):
    rsync -avP /mnt/recovery/ /destination/

SCENARIO: Send/Receive snapshots (backup/migrate)
-------------------------------------------------
Create a snapshot first:
    zfs snapshot poolname/dataset@backup

Send to a file (local backup):
    zfs send poolname/dataset@backup > /path/to/backup.zfs

Send with progress indicator:
    zfs send poolname/dataset@backup | pv > /path/to/backup.zfs

Send to another pool locally:
    zfs send poolname/dataset@backup | zfs recv newpool/dataset

Send to remote system over SSH:
    zfs send poolname/dataset@backup | ssh user@remote zfs recv pool/dataset

With progress and buffering for network transfers:
    zfs send poolname/dataset@backup | pv | mbuffer -s 128k -m 1G | \
        ssh user@remote "mbuffer -s 128k -m 1G | zfs recv pool/dataset"

SCENARIO: Encrypted pool - unlock and mount
-------------------------------------------
Load the encryption key (will prompt for passphrase):
    zfs load-key poolname

Or for all encrypted datasets:
    zfs load-key -a

Then mount:
    zfs mount -a

SCENARIO: Replace failed drive in mirror/raidz
----------------------------------------------
Check which drive failed:
    zpool status poolname

Replace the drive (assuming /dev/sdc is the new drive):
    zpool replace poolname /dev/old-drive /dev/sdc

Monitor resilver progress:
    zpool status poolname

SCENARIO: See what's using a dataset (before unmount)
-----------------------------------------------------
Check what processes have files open:
    lsof /mountpoint

Or for all ZFS mounts:
    lsof | grep poolname

USEFUL ZFS COMMANDS
-------------------
    zpool status              # Pool health overview
    zpool list                # Pool capacity
    zpool history poolname    # Command history
    zfs list                  # All datasets
    zfs list -t snapshot      # All snapshots
    zfs get all poolname      # All properties
    zdb -l /dev/sdX           # Low-level pool label info

================================================================================
2. DATA RECOVERY
================================================================================

QUICK REFERENCE
---------------
    tldr ddrescue             # Clone failing drives
    tldr testdisk             # Partition/file recovery
    tldr photorec             # Recover deleted files by type
    tldr smartctl             # Check drive health

FIRST: Assess drive health before recovery
------------------------------------------
Check if drive is failing (SMART data):
    smartctl -H /dev/sdX      # Quick health check
    smartctl -a /dev/sdX      # Full SMART report

Key things to look for:
    - "PASSED" vs "FAILED" health status
    - Reallocated_Sector_Ct  - bad sectors remapped (increasing = dying)
    - Current_Pending_Sector - sectors waiting to be remapped
    - Offline_Uncorrectable  - sectors that couldn't be read

If SMART shows problems, STOP and use ddrescue immediately.
Do not run fsck or other tools that write to a failing drive.

SCENARIO: Clone a failing drive (CRITICAL - do this first!)
------------------------------------------------------------
Golden rule: NEVER work directly on a failing drive.
Clone it first, then recover from the clone.

Clone to an image file (safest):
    ddrescue -d -r3 /dev/sdX /path/to/image.img /path/to/logfile.log

    -d      = direct I/O, bypass cache
    -r3     = retry bad sectors 3 times
    logfile = allows resuming if interrupted

Clone to another drive:
    ddrescue -d -r3 /dev/sdX /dev/sdY /path/to/logfile.log

Monitor progress:
ddrescue prints its own status display while it runs. It needs a seekable
output file, so it cannot write to a pipe such as pv. To summarize a log
file from another terminal:
    ddrescuelog -t /path/to/logfile.log

Resume an interrupted clone:
    ddrescue -d -r3 /dev/sdX /path/to/image.img /path/to/logfile.log

The log file tracks what's been copied, so the same command resumes.
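
The clone-and-resume commands above can be wrapped in a small helper. This is
a minimal sketch, not part of any existing tooling: the function name
clone_drive and the DRY_RUN variable are hypothetical, and GNU ddrescue is
assumed to be installed. It refuses to write onto the source device and always
passes the same log file, so rerunning it resumes an interrupted clone.

```shell
#!/usr/bin/env bash
# clone_drive SRC DEST LOGFILE
# Hypothetical helper: clone SRC to DEST with ddrescue, reusing LOGFILE so a
# second run resumes. Set DRY_RUN=1 to print the command instead of running it.
clone_drive() {
    if [ "$#" -ne 3 ]; then
        echo "usage: clone_drive SRC DEST LOGFILE" >&2
        return 2
    fi
    local src=$1 dest=$2 logfile=$3

    # Never write the image or the log onto the drive being rescued.
    if [ "$src" = "$dest" ] || [ "$src" = "$logfile" ]; then
        echo "error: destination and log file must differ from source" >&2
        return 1
    fi
    if [ -e "$logfile" ]; then
        echo "note: $logfile exists, ddrescue will resume from it" >&2
    fi

    # Same flags as in the guide: direct I/O, three retry passes.
    set -- ddrescue -d -r3 "$src" "$dest" "$logfile"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Example (prints the command without touching any device):
    DRY_RUN=1 clone_drive /dev/sdX /path/to/image.img /path/to/logfile.log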

If the drive is very bad, do a quick pass first, then retry bad sectors:
    ddrescue -d -n /dev/sdX image.img logfile.log     # Fast pass, skip the scraping phase
    ddrescue -d -r3 /dev/sdX image.img logfile.log    # Retry bad sectors

SCENARIO: Recover deleted files (PhotoRec)
------------------------------------------
PhotoRec recovers files by their content signatures, not by filesystem
metadata. It works even if the filesystem is damaged or reformatted.

Run PhotoRec (included with testdisk):
    photorec /dev/sdX         # From device
    photorec image.img        # From disk image

Interactive steps:
    1. Select the disk/partition
    2. Choose filesystem type (usually "Other" for FAT/NTFS/exFAT)
    3. Choose "Free" (unallocated) or "Whole" (entire partition)
    4. Select destination folder for recovered files
    5. Wait (can take hours for large drives)

Recovered files are named by type (e.g., f0001234.jpg) in recup_dir.*/

SCENARIO: Recover lost partition / Fix partition table
------------------------------------------------------
TestDisk can find and recover lost partitions.

Run TestDisk:
    testdisk /dev/sdX         # From device
    testdisk image.img        # From disk image

Interactive steps:
    1. Select disk
    2. Select partition table type (usually Intel/PC for MBR, EFI GPT)
    3. Choose "Analyse" to scan for partitions
    4. "Quick Search" finds most partitions
    5. "Deeper Search" if quick search misses any
    6. Review found partitions, select ones to recover
    7. "Write" to save new partition table (or just note the info)

TestDisk can also:
    - Recover deleted files from FAT/NTFS/ext filesystems
    - Repair FAT/NTFS boot sectors
    - Rebuild NTFS MFT

SCENARIO: Recover specific file types (Foremost)
------------------------------------------------
Foremost carves files based on headers/footers.
Useful when PhotoRec doesn't find what you need.

Basic usage:
    foremost -t all -i /dev/sdX -o /output/dir
    foremost -t all -i image.img -o /output/dir

Specific file types:
    foremost -t jpg,png,gif -i image.img -o /output/dir
    foremost -t pdf,doc,xls -i image.img -o /output/dir

Supported types: jpg, gif, png, bmp, avi, exe, mpg, wav, riff, wmv, mov, pdf,
ole (doc/xls/ppt), doc, zip, rar, htm, cpp, all

SCENARIO: Can't mount filesystem - try repair
----------------------------------------------
WARNING: Only run fsck on a COPY, not the original failing drive!

For ext2/ext3/ext4:
    fsck.ext4 -n /dev/sdX     # Check only, no changes (safe)
    fsck.ext4 -p /dev/sdX     # Auto-repair safe problems
    fsck.ext4 -y /dev/sdX     # Say yes to all repairs (risky)

For NTFS:
    ntfsfix /dev/sdX          # Fix common NTFS issues

For XFS:
    xfs_repair -n /dev/sdX    # Check only
    xfs_repair /dev/sdX       # Repair

For FAT32:
    fsck.fat -n /dev/sdX      # Check only
    fsck.fat -a /dev/sdX      # Auto-repair

SCENARIO: Mount a disk image for file access
---------------------------------------------
Mount a full disk image (find partitions first):
    fdisk -l image.img        # List partitions and offsets

Note the "Start" sector of the partition you want and multiply it by the
sector size (512 bytes on most disks; fdisk prints it in its "Units" line):
    mount -o loop,offset=$((START*512)) image.img /mnt/recovery

Or use losetup to set up loop devices for all partitions:
    losetup -P /dev/loop0 image.img
    mount /dev/loop0p1 /mnt/recovery

For NTFS images:
    mount -t ntfs-3g -o loop,offset=$((START*512)) image.img /mnt/recovery

SCENARIO: Low-level recovery from very bad drives (safecopy)
------------------------------------------------------------
Safecopy is more aggressive than ddrescue for very damaged media.
Use it when ddrescue can't make progress.

    safecopy /dev/sdX image.img

With multiple passes (increasingly aggressive):
    safecopy --stage1 /dev/sdX image.img    # Quick pass
    safecopy --stage2 /dev/sdX image.img    # Retry errors
    safecopy --stage3 /dev/sdX image.img    # Maximum recovery

DATA RECOVERY TIPS
------------------
    1. STOP using a failing drive immediately - every access risks more damage
    2. Clone first, recover from clone - never work on original
    3. Keep the log file from ddrescue - allows resuming
    4. Recover to a DIFFERENT drive - never same drive
    5. For deleted files on working drive, unmount immediately to prevent
       overwriting the deleted data
    6. If drive makes clicking/grinding noises, consider professional recovery
    7. For SSDs, TRIM may have already zeroed deleted blocks - recovery harder

================================================================================
3. BOOT REPAIR
================================================================================

QUICK REFERENCE
---------------
    tldr grub-install         # Install GRUB bootloader
    tldr efibootmgr           # Manage UEFI boot entries
    tldr arch-chroot          # Chroot into installed system
    man mkinitcpio            # Rebuild initramfs

FIRST: Identify your boot mode
------------------------------
Check if system is UEFI or Legacy BIOS:
    ls /sys/firmware/efi      # If the directory exists, you're in UEFI mode

If this rescue USB booted in UEFI mode, repair the UEFI boot path.
If it booted in Legacy mode, repair MBR/Legacy boot.

SCENARIO: Chroot into broken system (preparation for most repairs)
------------------------------------------------------------------
This is the foundation for most boot repairs.

1. Find your partitions:
    lsblk -f                  # Shows filesystems and labels

2. Mount the root filesystem:
    mount /dev/sdX2 /mnt      # Replace with your root partition

   For ZFS root:
    zpool import -R /mnt zroot
    zfs mount -a

3. Mount required system directories:
    mount /dev/sdX1 /mnt/boot # EFI partition (if separate)
    mount --bind /dev /mnt/dev
    mount --bind /proc /mnt/proc
    mount --bind /sys /mnt/sys
    mount --bind /sys/firmware/efi/efivars /mnt/sys/firmware/efi/efivars
    chroot /mnt /bin/bash

   Or use arch-chroot, which sets up the /dev, /proc and /sys mounts itself
   (you still mount root and the EFI partition first):
    arch-chroot /mnt

4. Now you can run commands as if booted into the system.
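
The manual mount sequence in steps 2-3 can be sketched as a function. This is
an illustration under assumptions, not the guide's own tooling: the names run
and prepare_chroot and the DRY_RUN variable are hypothetical, and it covers
only a non-ZFS root. The efivars bind is skipped when the rescue system was
not booted in UEFI mode.

```shell
#!/usr/bin/env bash
# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# prepare_chroot ROOT_PART [BOOT_PART] [TARGET]
# Hypothetical helper: mount ROOT_PART on TARGET (default /mnt), optionally
# BOOT_PART on TARGET/boot, then bind-mount /dev, /proc and /sys.
prepare_chroot() {
    local root_part=$1 boot_part=${2:-} target=${3:-/mnt} dir

    run mount "$root_part" "$target" || return 1
    if [ -n "$boot_part" ]; then
        run mount "$boot_part" "$target/boot" || return 1
    fi
    for dir in dev proc sys; do
        run mount --bind "/$dir" "$target/$dir" || return 1
    done
    # efivars exists only when the rescue system itself booted via UEFI.
    if [ -d /sys/firmware/efi/efivars ]; then
        run mount --bind /sys/firmware/efi/efivars \
            "$target/sys/firmware/efi/efivars" || return 1
    fi
}
```

Example:
    prepare_chroot /dev/sdX2 /dev/sdX1 && chroot /mnt /bin/bash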

SCENARIO: Reinstall GRUB (UEFI)
-------------------------------
After chrooting into the system:
    grub-install --target=x86_64-efi --efi-directory=/boot --bootloader-id=GRUB

If EFI partition is mounted elsewhere:
    grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=GRUB

Regenerate GRUB config:
    grub-mkconfig -o /boot/grub/grub.cfg

SCENARIO: Reinstall GRUB (Legacy BIOS/MBR)
------------------------------------------
After chrooting into the system:
    grub-install --target=i386-pc /dev/sdX    # Note: device, not partition

Regenerate GRUB config:
    grub-mkconfig -o /boot/grub/grub.cfg

SCENARIO: Fix UEFI boot entries
-------------------------------
List current boot entries:
    efibootmgr -v

Delete a broken entry (replace XXXX with boot number):
    efibootmgr -b XXXX -B

Create a new boot entry:
    efibootmgr --create --disk /dev/sdX --part 1 --label "Arch Linux" \
        --loader /EFI/GRUB/grubx64.efi

Change boot order (comma-separated boot numbers):
    efibootmgr -o 0001,0002,0003

Set next boot only:
    efibootmgr -n 0001

SCENARIO: Rebuild initramfs (kernel panic, missing modules)
-----------------------------------------------------------
After chrooting into the system:

List available presets:
    ls /etc/mkinitcpio.d/

Rebuild for specific kernel:
    mkinitcpio -p linux       # Standard kernel
    mkinitcpio -p linux-lts   # LTS kernel

Rebuild all:
    mkinitcpio -P

Check mkinitcpio.conf for ZFS:
    grep "^HOOKS" /etc/mkinitcpio.conf

For ZFS, HOOKS should include 'zfs' before 'filesystems'. Put 'keyboard'
before 'zfs' so a passphrase can be typed for encrypted pools:
    HOOKS=(base udev autodetect modconf block keyboard zfs filesystems fsck)

SCENARIO: GRUB not detecting Windows (dual-boot)
------------------------------------------------
After chrooting into the system (the os-prober package must be installed):

Enable os-prober in GRUB config:
    echo 'GRUB_DISABLE_OS_PROBER=false' >> /etc/default/grub

Mount the Windows EFI partition if not already mounted.

Regenerate GRUB config:
    grub-mkconfig -o /boot/grub/grub.cfg

os-prober should find Windows and add it to the menu.
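
Appending with echo works, but every run adds another line to
/etc/default/grub. The sketch below sets the option without creating
duplicates. The function name set_grub_option is hypothetical, GNU sed
(sed -i -E) is assumed, and the value must not contain "|" or "&" because it
is passed straight into a sed replacement.

```shell
#!/usr/bin/env bash
# set_grub_option FILE KEY VALUE
# Hypothetical helper: if FILE already has a KEY= line (commented out or not),
# rewrite it to KEY=VALUE; otherwise append KEY=VALUE.
set_grub_option() {
    local file=$1 key=$2 value=$3

    if grep -Eq "^#?[[:space:]]*${key}=" "$file"; then
        # Rewrite the existing line in place, dropping a leading '#'.
        sed -i -E "s|^#?[[:space:]]*${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

Example:
    set_grub_option /etc/default/grub GRUB_DISABLE_OS_PROBER false
    grub-mkconfig -o /boot/grub/grub.cfg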

SCENARIO: Restore Windows MBR (remove GRUB, restore Windows boot)
-----------------------------------------------------------------
If you need to remove Linux and restore a Windows-only MBR:
    ms-sys -7 /dev/sdX        # Write a Windows 7 MBR

Other options:
    ms-sys -w /dev/sdX        # Write an automatically selected boot record
    ms-sys /dev/sdX           # No option: show current boot record type

SCENARIO: Install syslinux (lightweight alternative to GRUB)
------------------------------------------------------------
For Legacy BIOS:
    syslinux-install_update -i -a -m

For UEFI, copy the EFI binary:
    cp /usr/lib/syslinux/efi64/* /boot/EFI/syslinux/

Create /boot/syslinux/syslinux.cfg with boot entries.

SCENARIO: Can't boot - kernel panic with ZFS
--------------------------------------------
Common causes:
    1. ZFS module not in initramfs - rebuild with mkinitcpio
    2. Pool name changed - check /etc/zfs/zpool.cache
    3. hostid mismatch - regenerate hostid

After chrooting:

Check if ZFS hook is present:
    grep zfs /etc/mkinitcpio.conf

Regenerate hostid if needed (add -f to overwrite an existing /etc/hostid):
    zgenhostid $(hostid)

Rebuild initramfs:
    mkinitcpio -P

SCENARIO: Emergency boot from GRUB command line
-----------------------------------------------
If GRUB loads but config is broken, press 'c' for command line:

For Linux (non-ZFS):
    set root=(hd0,gpt2)
    linux /boot/vmlinuz-linux root=/dev/sda2
    initrd /boot/initramfs-linux.img
    boot

For Linux with ZFS root:
    set root=(hd0,gpt1)
    linux /vmlinuz-linux-lts root=ZFS=zroot/ROOT/default
    initrd /initramfs-linux-lts.img
    boot

Tab completion works in GRUB command line!

BOOT REPAIR TIPS
----------------
    1. Always back up your current EFI partition before making changes
    2. Use 'efibootmgr -v' to see full paths and verify entries
    3. Some UEFI firmwares are picky about the bootloader path - try
       /EFI/BOOT/BOOTX64.EFI as a fallback
    4. If all else fails, most UEFI firmware has a boot menu (F12, F8, Esc at POST)
    5. A GRUB reinstall fixes most boot issues
    6. For ZFS, the initramfs must include the zfs hook

================================================================================
4. WINDOWS RECOVERY
================================================================================

QUICK REFERENCE
---------------
    tldr chntpw               # Reset Windows passwords
    tldr ntfs-3g              # Mount NTFS filesystems
    man dislocker             # Access BitLocker drives
    man hivexregedit          # Edit Windows registry

FIRST: Identify and mount the Windows partition
-----------------------------------------------
Find Windows partition:
    lsblk -f                  # Look for "ntfs" filesystem
    fdisk -l                  # Look for "Microsoft basic data" type

Check if BitLocker encrypted:
    lsblk -f                  # Will show "BitLocker" instead of "ntfs"

Mount NTFS partition (read-write):
    mkdir -p /mnt/windows
    mount -t ntfs-3g /dev/sdX1 /mnt/windows

If Windows wasn't shut down cleanly (hibernation/fast startup):
    mount -t ntfs-3g -o remove_hiberfile /dev/sdX1 /mnt/windows

Read-only mount (safer):
    mount -t ntfs-3g -o ro /dev/sdX1 /mnt/windows

SCENARIO: Reset forgotten Windows password
------------------------------------------
Mount the Windows partition first (see above).

Navigate to the SAM database:
    cd /mnt/windows/Windows/System32/config

List all users:
    chntpw -l SAM

Reset password for a specific user (interactive):
    chntpw -u "Username" SAM

In the interactive menu:
    1. Clear (blank) user password      <-- Recommended
    2. Unlock and enable user account
    3. Promote user to administrator
    q. Quit

After making changes, type 'q' to quit, then 'y' to save.
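
Because chntpw writes to the SAM hive in place, it can help to copy the hive
aside before running the steps above. The following is a minimal sketch; the
function name backup_hive and the timestamped ".bak" naming are hypothetical
choices, and the partition must be mounted read-write for the copy to succeed.

```shell
#!/usr/bin/env bash
# backup_hive HIVE_PATH
# Hypothetical helper: copy a registry hive next to itself with a timestamp
# suffix and print the path of the backup.
backup_hive() {
    local hive=$1 backup

    if [ ! -f "$hive" ]; then
        echo "error: $hive not found" >&2
        return 1
    fi
    backup="$hive.$(date +%Y%m%d-%H%M%S).bak"
    # -p keeps timestamps and mode where the filesystem allows it.
    cp -p -- "$hive" "$backup" && echo "$backup"
}
```

Example:
    backup_hive /mnt/windows/Windows/System32/config/SAM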

Alternative - interactive mode (menu-driven, pick users one at a time):
    chntpw -i SAM             # Interactive mode, select options

SCENARIO: Unlock disabled/locked Windows account
------------------------------------------------
    cd /mnt/windows/Windows/System32/config
    chntpw -u "Username" SAM

Select option 2: "Unlock and enable user account"

SCENARIO: Promote user to Administrator
---------------------------------------
    cd /mnt/windows/Windows/System32/config
    chntpw -u "Username" SAM

Select option 3: "Promote user (make user an administrator)"

SCENARIO: Access BitLocker encrypted drive
------------------------------------------
You MUST have either:
    - The BitLocker password, OR
    - The 48-digit recovery key

Find your recovery key:
    - Microsoft account: account.microsoft.com/devices/recoverykey
    - Printed/saved during BitLocker setup
    - Active Directory (for domain-joined PCs)

Decrypt with password:
    mkdir -p /mnt/bitlocker-decrypted /mnt/windows
    dislocker -V /dev/sdX1 -u -- /mnt/bitlocker-decrypted
    # Enter password when prompted

Decrypt with recovery key:
    dislocker -V /dev/sdX1 \
        -p123456-789012-345678-901234-567890-123456-789012-345678 \
        -- /mnt/bitlocker-decrypted

Now mount the decrypted volume:
    mount -t ntfs-3g /mnt/bitlocker-decrypted/dislocker-file /mnt/windows

When done:
    umount /mnt/windows
    umount /mnt/bitlocker-decrypted

SCENARIO: Copy files from Windows that won't boot
-------------------------------------------------
Mount the Windows partition (see above), then:

Copy specific files/folders:
    cp -r "/mnt/windows/Users/Username/Documents" /destination/

Copy with rsync (shows progress, preserves attributes):
    rsync -avP "/mnt/windows/Users/Username/" /destination/

Common locations for user data:
    /mnt/windows/Users/Username/Desktop/
    /mnt/windows/Users/Username/Documents/
    /mnt/windows/Users/Username/Downloads/
    /mnt/windows/Users/Username/Pictures/
    /mnt/windows/Users/Username/AppData/      (hidden app data)

SCENARIO: Edit Windows Registry
-------------------------------
The registry is stored in several hive files:
    SYSTEM     - Hardware, services, boot config
    SOFTWARE   - Installed programs, system settings
    SAM        - User accounts (password hashes)
    SECURITY   - Security policies
    DEFAULT    - Default user profile
    NTUSER.DAT - Per-user settings (in each user's profile)

Export registry contents to a .reg file:
    hivexregedit --export /mnt/windows/Windows/System32/config/SYSTEM '\' > system.reg

Merge changes from a .reg file (--prefix strips the Windows key prefix so
that key names in the file map onto the hive root):
    hivexregedit --merge --prefix 'HKEY_LOCAL_MACHINE\SOFTWARE' \
        /mnt/windows/Windows/System32/config/SOFTWARE changes.reg

Interactive registry shell:
    hivexsh /mnt/windows/Windows/System32/config/SYSTEM
    # Commands: cd, ls, lsval, cat, exit

SCENARIO: Fix Windows boot (from Linux)
---------------------------------------
Sometimes you can fix Windows boot issues from Linux:

Rebuild BCD (Windows Boot Configuration Data):
    - This usually requires Windows Recovery Environment
    - From Linux, you can backup/restore the BCD file:
      cp /mnt/windows/Boot/BCD /mnt/windows/Boot/BCD.backup

Restore Windows bootloader to MBR (if GRUB overwrote it):
    ms-sys -7 /dev/sdX        # Write a Windows 7 MBR

For UEFI systems, Windows boot files are in:
    /mnt/efi/EFI/Microsoft/Boot/

SCENARIO: Scan Windows for malware (offline scan)
-------------------------------------------------
Update ClamAV definitions first (requires internet):
    freshclam

Scan the Windows partition:
    clamscan -r /mnt/windows                      # Basic scan
    clamscan -r -i /mnt/windows                   # Only show infected files
    clamscan -r --move=/quarantine /mnt/windows   # Quarantine infected

Scan common malware locations (leave the glob unquoted so the shell
expands it):
    clamscan -r /mnt/windows/Users/*/AppData
    clamscan -r "/mnt/windows/Windows/Temp"
    clamscan -r "/mnt/windows/ProgramData"

Note: ClamAV detection isn't as comprehensive as commercial AV.
Best for known malware; may miss new/sophisticated threats.

SCENARIO: Disable Windows Fast Startup (to mount NTFS read-write)
-----------------------------------------------------------------
Windows 8+ uses "Fast Startup" (hybrid shutdown) by default.
This leaves NTFS in a "dirty" state, preventing safe writes from Linux.

Option 1: Force mount (deletes the hibernation file, so any unsaved Windows
session state is lost):
    mount -t ntfs-3g -o remove_hiberfile /dev/sdX1 /mnt/windows

Option 2: Boot Windows and disable Fast Startup:
    - Control Panel > Power Options > "Choose what the power buttons do"
    - Click "Change settings that are currently unavailable"
    - Uncheck "Turn on fast startup"
    - Shutdown (not restart) Windows

Option 3: Via registry from Linux. An offline hive has no CurrentControlSet
key; edit the active control set instead (usually ControlSet001, check the
"Current" value under \Select to confirm):
    hivexregedit --merge --prefix 'HKEY_LOCAL_MACHINE\SYSTEM' \
        /mnt/windows/Windows/System32/config/SYSTEM << 'EOF'
    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Session Manager\Power]
    "HiberbootEnabled"=dword:00000000
    EOF

WINDOWS RECOVERY TIPS
---------------------
    1. Always try mounting read-only first to assess the situation
    2. Windows Fast Startup/hibernation prevents safe NTFS writes
    3. BitLocker recovery key is essential - no key = no access
    4. chntpw blanks passwords; it cannot recover/show old passwords
    5. Back up registry hives before editing them
    6. If Windows is bootable but locked out, just reset the password
    7. For serious Windows issues, Windows Recovery Environment may be needed
    8. Some antivirus/security software may re-lock accounts on next boot

================================================================================
5. HARDWARE DIAGNOSTICS
================================================================================

[To be added]

================================================================================
6. DISK OPERATIONS
================================================================================

[To be added]

================================================================================
7. NETWORK TROUBLESHOOTING
================================================================================

[To be added]

================================================================================
END OF GUIDE
================================================================================