-rw-r--r-- docs/design/2026-06-30-captive-portal-login.org | 89
-rw-r--r-- todo.org | 10
2 files changed, 97 insertions, 2 deletions
diff --git a/docs/design/2026-06-30-captive-portal-login.org b/docs/design/2026-06-30-captive-portal-login.org
new file mode 100644
index 0000000..1739689
--- /dev/null
+++ b/docs/design/2026-06-30-captive-portal-login.org
@@ -0,0 +1,89 @@
+#+TITLE: Captive-portal login — learnings + baking it into the net panel
+#+DATE: 2026-06-30
+#+SOURCE: the 2026-06-30 Hyatt wifi saga (velox)
+
+* Why this exists
+
+On a locked-down-DNS laptop, captive portals never show their login page, even
+though phones get on fine. We spent hours on a Hyatt portal before finding the
+mechanism; this captures it so the fix becomes a panel feature instead of a
+one-off script.
+
+* The mechanism (what actually blocks the login)
+
+A redirect portal works by *DNS hijack*: you query a name, the hotel's resolver
+hands back the portal, you get the login page. Two things on velox stop that:
+
+- *System resolver forces DNS-over-TLS.* =/etc/systemd/resolved.conf.d/dns-over-tls.conf=
+ hardcodes =DNS=1.1.1.1#... 9.9.9.9#...= with =DNSOverTLS=yes=. The system never
+ queries the hotel's resolver at all. The hotel blocks 853 (DoT) and external
+ 53, so system DNS is simply dead on the portal — only 443 (DoH) gets out.
+- *Browser DoH.* Chrome "secure DNS" on bypasses the hotel DNS too, so the
+ browser never gets redirected either.
+
+A phone works because it uses *plain DNS* from the hotel plus a built-in
+captive-portal popper. The laptop has neither.
+
+Confirmed facts from the saga:
+- Front desk: it's a normal redirect-to-login portal. Phone: connects fine.
+- No DHCP option 114 (RFC 8910) — the portal doesn't advertise its URL. But the
+ URL is recoverable from the HTTP 302 once you're on plain DNS.
+- The walled garden whitelists OS captive-detection endpoints
+ (=captive.apple.com= returns "Success") — a *misleading* signal, not real
+ internet. Don't trust it.
+- 443/DoH egress works broadly on the portal; only port-53 DNS is held. So
+ "system DNS fails" never means "no internet" here.
+
+* The working fix (=~/.local/bin/hotel-wifi=, to be folded in)
+
+Temporarily disable DoT → plain hotel DNS → discover the portal URL from the
+redirect → open it in a clean browser profile (no DoH, no stale HSTS/cookies) →
+click the button → restore DoT. Reversible; tested to restore cleanly.
+
+#+begin_src sh
+#!/bin/sh
+# hotel-wifi disable DoT -> find the portal login URL -> open it
+# hotel-wifi off restore normal encrypted DNS (run once online)
+conf=/etc/systemd/resolved.conf.d/dns-over-tls.conf
+if [ "${1:-on}" = "off" ]; then
+ [ -f "$conf.captive-disabled" ] && sudo mv "$conf.captive-disabled" "$conf"
+ sudo systemctl restart systemd-resolved
+ echo "Encrypted DNS (DoT) restored."; exit 0
+fi
+[ -f "$conf" ] && sudo mv "$conf" "$conf.captive-disabled"
+sudo systemctl restart systemd-resolved; sleep 1
+resolvectl flush-caches 2>/dev/null || true
+portal=""
+for t in http://captive.apple.com/hotspot-detect.html http://neverssl.com \
+ http://detectportal.firefox.com/canonical.html; do
+ loc=$(curl -sS -m 6 -o /dev/null -w '%{redirect_url}' "$t" 2>/dev/null)
+ [ -n "$loc" ] && { portal="$loc"; break; }
+ url=$(curl -sS -m 6 "$t" 2>/dev/null | grep -ioE 'https?://[^"'"'"' >]+' \
+ | grep -ivE 'apple\.com|neverssl|firefox|w3\.org|gstatic' | head -1)
+ [ -n "$url" ] && { portal="$url"; break; }
+done
+prof=$(mktemp -d)
+setsid -f google-chrome-stable --user-data-dir="$prof" "${portal:-http://neverssl.com}" >/dev/null 2>&1
+echo "Click the login button. When online: hotel-wifi off"
+#+end_src
+
+* Baking it into the net panel (the task)
+
+- The net engine already diagnoses captive / no-internet. When it sees a held
+ portal, the panel should offer a first-class *"Log in to this network"*
+ action that runs the plain-DNS + clean-browser flow above, reversibly, and
+ auto-restores DoT when connectivity returns (or on a timeout).
+- Reconcile with the existing =net portal= command and the =captive= helper —
+ they assumed a DNS-hijack-to-gateway model that did NOT match this portal
+ (gateway served no web; DNS was held, not hijacked-to-portal). The plain-DNS
+ approach is the one that worked; make it the engine's portal path.
+- The DoT toggle must be safe and reversible (the =off= step). Consider a
+ per-connection or time-boxed DoT-off that can't strand encrypted DNS.
+- Surface the misleading-"Success" lesson: a whitelisted captive-check passing
+ is not "online" — gate on a real, non-whitelisted fetch.
+
+* Related fix that unblocked the panel (already shipped)
+
+The panel could never switch networks because =net up= placed =--wait= after the
+nmcli subcommand (it's a global option). Fixed in dotfiles 2432311; fake-nmcli
+now rejects the misplaced flag so it can't regress.
diff --git a/todo.org b/todo.org
index 98c6ed3..0ac7e3d 100644
--- a/todo.org
+++ b/todo.org
@@ -21,10 +21,16 @@ The vocabulary is open — topic tags are coined as needed — so these are conv
- *Effort / autonomy*: =:quick:= a spare-moment fix (minutes, not a sitting); =:solo:= Claude can carry it end to end — there's a build path, a test path, and no upfront decision needed (a leftover manual spot-check doesn't disqualify it).
- *Topic / area* (open): the subsystem a task touches — e.g. =:hyprland:= =:waybar:= =:mpd:= =:music:= =:network:= =:tooling:= =:llm:= =:eask:= =:pocketbook:= =:cmail:=. Coin a new one when it aids filtering.
* Archsetup Open Work
-** TODO [#B] ZFS pre-pacman snapshot installer step (ZFS-root) :feature:zfs:
+** TODO [#B] Bake captive-portal login into the net panel :feature:network:
+Make the captive-portal login a first-class net-panel feature instead of the one-off =~/.local/bin/hotel-wifi= script. When the engine sees a held portal, offer "Log in to this network" that runs the plain-DNS + clean-browser flow reversibly (disable DoT -> recover the portal URL from the redirect -> open a clean Chrome profile -> restore DoT when online). Reconcile with the existing =net portal= / =captive= helper, whose DNS-hijack-to-gateway model did NOT match the real Hyatt portal.
+
+Full mechanism writeup, the working script, and the integration plan: [[file:docs/design/2026-06-30-captive-portal-login.org]]. From the 2026-06-30 Hyatt saga.
+
+** DONE [#B] ZFS pre-pacman snapshot installer step (ZFS-root) :feature:zfs:
+CLOSED: [2026-06-30 Tue]
Add a ZFS-root-gated installer step that installs the pre-pacman snapshot pacman hook plus a self-pruning =/usr/local/bin/zfs-pre-snapshot= (KEEP=10). The script is hand-placed on velox, not authored by archsetup, so a reinstall loses it; snapshots accumulated unbounded (53 since April) because nothing prunes them and Sanoid ignores non-autosnap_ names. Gate to ZFS-root (velox; ratio is btrfs). Also correct the stale 2026-01-17 security-doc line claiming it's "already in install-archzfs". Needs the hook file (source from velox) and a ZFS-root VM test.
-Design notes and the KEEP=10 script: [[file:docs/design/2026-06-29-zfs-pre-snapshot-installer.org]]. Origin: home handoff 2026-06-29.
+Shipped: =configure_pre_pacman_snapshots()= in boot_ux (late, ZFS-gated) + =scripts/zfs-pre-snapshot=; unit tests for pruning + a Testinfra assertion. VM-verified ZFS install passed 97/0 (test_zfs_pre_pacman_snapshot_hook PASSED). The "stale doc" turned out accurate (it's an install-archzfs archive) — left as-is. Design notes and the KEEP=10 script: [[file:docs/design/2026-06-29-zfs-pre-snapshot-installer.org]]. Origin: home handoff 2026-06-29.
** TODO [#B] Consistent red=off across waybar toggle modules :waybar:
Extend the red=off convention (just added to the touchpad/mouse indicator) to the other toggles — sound volume, microphone mute, and caffeine — so a disabled / muted / off state reads red across the board. Skip the "cross"/slash; the color alone carries it. Origin: roam inbox capture.