aboutsummaryrefslogtreecommitdiff
path: root/docs/design/2026-06-29-waybar-network-module-spec.org
blob: db9657dc8c9353e472bf294dc956b52f18e1c3c2 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
#+TITLE: Waybar Network Module — Design Spec
#+AUTHOR: Craig Jennings & Claude
#+DATE: 2026-06-29

* Status

*Phase 1 SHIPPED* (2026-06-29, dotfiles =5254bd8=..=c095a22=, 10 commits): engine
(=net status/probe/diagnose/repair/doctor/portal=) + =waybar-net= indicator +
split-cadence cache + redacted event log + Makefile recovery targets + airplane
absorption. 160 net tests; pure modules ≥90% branch. One as-built deviation from
this spec: airplane absorption is *display-only* (Craig's call, option 1) — net
shows the airplane state but the =airplane-mode= low-power toggle is KEPT (it does
radios + CPU + brightness + services, not a network concern); only =waybar-airplane=
+ =custom/airplane= + =waybar-netspeed= were deleted. See decision 12. Phases 2-5
remain. Live waybar eyeball is under todo.org "Manual testing and validation".

Ready for Phase 1; Ready-with-caveats overall. Three Codex review rounds + Craig's
cj comments are all incorporated — every finding has a disposition and the findings
cookie reads complete ([31/31]), with no open decisions (enterprise scope settled:
open + WPA-PSK in v1, 802.1X add/edit vNext, activate-only). The cj comments
reshaped several decisions (no separate credential store — use NM's own; =net
doctor= + Makefile console-recovery in v1; rfkill + full-stack-bounce repair;
airplane module absorbed; VPN a later Phase 5). The only remaining caveats are
Phase-2/3 build unknowns named under Open items (gtk4-layer-shell anchoring, the
=captive= =--json= refactor) — not Phase-1 blockers. Phase 1 (indicator + console
recovery) is ready to build.

* Goal

One waybar network component that does the whole job: shows connection state
(including the missing "associated but no internet / captive portal" state),
manages connections from a dropdown (nmcli-backed; secrets stay in
NetworkManager's own store, no separate credential file), and runs the network
diagnostics and remediation off the same place
(captive-portal detection + forcing, bounce/reset, gateway/DNS checks, speed
test).

It unifies three todo tasks that are really one feature:
- =[#C]= "archsetup Waybar Wi-Fi module should show no-internet state" — the
  indicator state plus the 2026-06-22 roam expansion (bounce, diagnostics, speed
  test off the component).
- =[#B]= "Network-manager dropdown, nmcli-backed" — the management dropdown. (The
  todo task's original "GPG-stored secrets" framing is superseded: secrets stay in
  NM's own store, decision 5.)
- The network diagnostics already shipped in =captive= (the hotel/captive-portal
  tool, formerly =login-page=) become this module's diagnostics engine rather
  than a standalone CLI.

* Scope

** In
- *Indicator* — wifi/ethernet icon + signal + SSID, plus an internet sub-state:
  online / captive / no-internet / connecting / disconnected / airplane.
- *Absorbs the airplane module* — the airplane state + toggle move into
  =custom/net= (airplane is a network concern). Once this ships, the standalone
  =custom/airplane= module, the =waybar-airplane= + =airplane-mode= scripts, their
  =tests/=, and the css are deleted (listed under Files touched). The
  desktop-settings panel (sibling =[#B]=) no longer needs an airplane row.
- *Interface-correct* — targets the wifi (or chosen) device, not the
  default-route interface, so an active USB tether or wired link can't mask
  wifi state. (Same lesson =captive= fixed; the current =custom/netspeed= keys
  off the default route and has the bug.)
- *Connection management (panel)* — list saved connections most-recently-used
  first, live signal for in-range wifi, click to switch; add / edit / remove for
  open + WPA-PSK; activate any existing saved profile (including enterprise ones
  NM already stores); ethernet↔wifi and wifi↔wifi switching even when a link
  appears mid-session.
- *Diagnostics (panel)* — read-only Diagnose (captive probe 204-vs-portal with
  the extracted portal URL, gateway ping, DNS config) separated from mutating
  Repair. Repair has tiers, lightest first: rfkill-unblock, per-connection reset
  (fresh MAC), full-stack bounce (=nmcli networking off/on=, then restart
  NetworkManager if that fails), and the temporary 1.1.1.1 override test. Each
  Repair action confirms and verifies cleanup.
- *Speed test (panel)* — down/up/ping with a progress indicator and last-result
  shown, via the already-installed =speedtest-go --json=.
- *Connection secrets* — none of our own. Settings and passwords live where NM
  already keeps them: =/etc/NetworkManager/system-connections/*.nmconnection=
  (root-only =0600=, the PSK/EAP secret stored inline). We read/write them through
  nmcli, which handles the privilege. No separate file, no GPG, no gpg-agent — one
  fewer dependency, and NM's store is already the secure-at-rest source of truth.
- *Persistence* — connectivity probe result cached in the runtime dir so the
  bar reads it cheaply between probes.
- *Observability* — a redacted JSONL event log so a post-failure session can
  diagnose without re-running destructive actions.

** Out (v1, note for later)
- No replacement of NetworkManager's connection engine. NM stays the thing that
  connects; we drive it via nmcli.
- No add/edit *form* for WPA-Enterprise / 802.1X in v1. The reason is effort vs
  payoff: 802.1X has many interdependent fields (CA cert, client cert, identity,
  anonymous identity, phase-2 auth) where a wrong entry silently fails auth, so a
  trustworthy form is a lot of UI for connections Craig rarely adds (open +
  WPA-PSK covers home, hotels, and phone hotspots). v1 still *activates* existing
  saved enterprise profiles and points editing at =nmtui=/=nmcli=. Settled
  (Craig, 2026-06-29): enterprise add/edit is vNext — 24 saved profiles on velox,
  0 enterprise, so the form would be unused UI; if one ever appears nmtui adds it
  once and the module activates it thereafter.
- No per-connection captive-portal *auto-login* in v1. (That would mean storing a
  portal's login form answers — room number, surname, a checkbox — and replaying
  them automatically when a known portal is detected, so the page never appears.
  Out for v1 because every portal's form differs and it means storing per-venue
  answers; v1 just opens the portal for you.)
- No graphing/history of speed-test results beyond the last run.
- No static-IP / proxy / metered / MAC-randomization editing in v1 (activate
  existing, edit elsewhere).
- No VPN / WireGuard management in v1, but it's a planned later phase (Phase 5),
  not a permanent exclusion — it folds the existing archsetup wireguard tooling
  into the same panel/CLI.
- The desktop-settings dropdown (sibling =[#B]=) is a separate module, but it
  shares the GTK4 layer-shell panel shell built here.

* Architecture

Three layers. Keep the bar cheap, the panel rich, the logic in one tested place.

1. *Engine* — a =net= Python package (src-layout, unittest), exposing a CLI. Wraps
   every nmcli op and owns the diagnostics. Emits JSON. This is the testable
   core (fake =nmcli= / =curl= / =speedtest-go= on PATH, like the existing
   =waybar-netspeed= and =waybar-sysmon= test harnesses). Precedent: pocketbook is
   Python in the dotfiles repo; =wtimer= is Python for the same testability
   reason.
2. *Indicator* — a thin =waybar-net= script that calls =net status --json= and
   renders icon + signal + state + tooltip. Replaces =custom/netspeed=
   (throughput folds into the tooltip).
3. *Panel* — a GTK4 + gtk4-layer-shell app (mirrors pocketbook's structure)
   that imports the engine. Hosts connection management, diagnostics, and the
   speed test.

How the existing pieces map in:
- =captive= (bash, shipped) — the engine shells out to it for the heavy,
  interactive portal-force flow (sudo reset, DNS override, browser launch). Its
  cheap portal-detection logic is mirrored natively in the engine for the fast
  status path so the bar never blocks on a subprocess. =captive= stays a usable
  standalone CLI. The refactor (below) extracts its probe + reset into functions
  the engine can call non-interactively.
- =waybar-netspeed= (sh, shipped) — retired; its throughput sampling moves into
  the engine's status output and renders in the indicator tooltip only.
- =nmcli= — the connection backend for every op.

Language note: the engine is Python; the indicator is a thin Python or sh
wrapper over =net status --json=. The bar path must stay fast (see Performance
budgets), so the indicator does no network I/O itself — it reads link state and
the cached connectivity result.

* Repository + dependencies

- *Code lives in the dotfiles repo* (=~/.dotfiles=), not archsetup. The =net=
  package sits in-tree like pocketbook (src-layout, unittest, Makefile target);
  =waybar-net= and the =net= CLI entry live in the hyprland tier
  (=hyprland/.local/bin/=). Tests under =tests/net/= and =tests/waybar-net/=.
  archsetup owns only the *dependency install*, not the code.
- *archsetup installs the deps* in its Hyprland step: =gtk4-layer-shell=,
  =python-gobject=, plus =nmcli=/=curl=/=resolvectl=/=rfkill= (already present via
  NetworkManager/curl/systemd/util-linux). Speed test uses =speedtest-go= (AUR
  =speedtest-go-bin=, already installed on velox); archsetup adds it to the AUR
  list. librespeed-cli is the documented fallback if a self-hosted LibreSpeed
  server is ever wanted. No =gpg= dependency (secrets live in NM's own store).
- *Daily-drivers*: a stowed-script + AUR-dep feature, so ratio needs the same
  =git pull= + stow + the archsetup-added deps. Note the manual dep step in the
  rollout.

** Makefile targets (console recovery is a first-class path)
=net doctor= and the diagnostics are reachable from a bare TTY when waybar and
the GUI are down — that's the case where you most need them. The dotfiles
Makefile carries targets that wrap the =net= CLI so "get back online" is one make
command from the console:
- =make online= — =net doctor --fix= (diagnose, then apply the lightest repair:
  rfkill-unblock → reset → bounce → open portal). The headline recovery target.
- =make net-doctor= — =net doctor= (read-only diagnose + recommendation).
- =make net-status= / =make net-diagnose= / =make net-portal= / =make net-reset=
  / =make net-bounce= — the individual ops.
- =make test= — already runs =tests/*=; the =net= package's unittest suites are
  collected the same way.
These intentionally need only nmcli/curl/rfkill (no GUI, no waybar, no Python
GTK), so they work from a TTY on a broken graphical session.

* Connectivity model — split cadence

The indicator polls every ~2s, but a real internet/captive probe every 2s wastes
battery and can re-trigger a captive portal. So split it:

- *Fast path (every poll, cheap, no network)* — interface, type, SSID, signal,
  IPv4 presence, throughput sample. From nmcli / sysfs only. No network I/O.
- *Slow path (cached, TTL ~45s)* — the actual internet/captive probe (the 204
  check + meta-refresh portal extraction). Result cached at
  =$XDG_RUNTIME_DIR/waybar/net-connectivity.json= with a timestamp.

The indicator reads the cache each poll. When the cache is older than the TTL,
=net status= kicks =net probe= in the background (spawn + detach, never awaited)
and renders the last cached sub-state meanwhile. A user-triggered
diagnose/reconnect refreshes the cache immediately. This keeps the bar
responsive and the portal un-poked.

** Concurrency, atomicity, staleness
- *Single-flight* — =net probe= takes a lock file at
  =$XDG_RUNTIME_DIR/waybar/net-probe.lock= (flock, non-blocking). A second probe
  while one runs is a no-op, so a flapping 2s poll can't pile up overlapping
  probes.
- *Atomic writes* — the cache is written to a temp file + =os.replace= (atomic
  rename), so a reader never sees a half-written cache. Same pattern as =wtimer=.
- *Max probe runtime* — the probe has a hard timeout (≤ 6s total: curl
  =--max-time 5= + slack). On timeout it writes an =unknown= result, never hangs.
- *Stale classes* the indicator distinguishes: fresh (< TTL), stale (TTL..3×TTL,
  shown with a subdued/aging hint), expired (> 3×TTL → treat as unknown),
  unknown (no cache / probe failed). The bar never shows a confident "online"
  past the expired threshold.
- *Invalidation* — the cache records the iface + SSID + active-connection UUID it
  was taken under; a change in any of them invalidates it immediately (a
  reconnect must not show the old network's verdict).
- *Crash cleanup* — a stale lock older than the max runtime is ignored/reclaimed.

* Performance budgets (hot path)

The bar exec path (=waybar-net= → =net status=) must stay responsive:
- *Budget*: =net status= returns in < 100ms typical, < 250ms worst case.
- *No sleeping in the bar path.* Throughput is sampled from two reads of
  =/sys/class/net/<iface>/statistics/{rx,tx}_bytes= across the *waybar poll
  interval itself* (delta since the last cached sample + timestamp), not via an
  in-process =sleep= like the old =waybar-netspeed=. The cache holds the prior
  counters.
- *Subprocess cap*: at most one =nmcli= invocation on the hot path (a single
  =nmcli -t -f ...= multi-field query), plus sysfs reads. Never a per-field
  nmcli call.
- *Every subprocess has a timeout* (=nmcli --wait 2=, =subprocess timeout=). On
  timeout or error the indicator emits a degraded JSON state (class
  =net-degraded=, a neutral glyph) rather than blocking or crashing waybar.
- *Benchmark test*: a fake slow =nmcli= asserts =net status= still returns within
  budget by falling back to the degraded state.

* Engine — =net= CLI surface

All subcommands take =--json= where a machine reads them. Pure formatting/state
functions under the CLI; IO (nmcli, curl, file) at the edges. Every subcommand
exits non-zero with a JSON error envelope (see JSON schemas) on failure.

- =net status [--json] [--iface IF]= — fast link state + cached connectivity
  sub-state + throughput. The indicator's source. Never does network I/O.
- =net probe [--iface IF]= — run the connectivity/captive probe now, update the
  cache (single-flight, atomic), print online | captive (+ portal URL) |
  no-internet | unknown. Mirrors =captive='s cheap detection natively.
- =net list [--json]= — saved connections, MRU order, active flag, plus in-range
  wifi with signal.
- =net up <uuid>= / =net down [--iface IF]= — switch / disconnect. Operates on
  UUID, not name (see nmcli contract).
- =net add= / =net edit <uuid>= / =net remove <uuid>= — manage connections
  (open + WPA-PSK) through nmcli; the secret lands in NM's own
  =.nmconnection=. Enterprise profiles are activate-only.
- =net rescan [--iface IF]= — wifi rescan.
- =net diagnose [--json]= — read-only report: gateway ping, DNS config, captive
  probe. The structured contract below. Doubles as the post-failure snapshot.
- =net repair <action> [--json]= — mutating remediation, lightest first:
  =rfkill= (unblock + radio on), =reset= (fresh MAC), =bounce= (full-stack:
  =nmcli networking off/on=, escalating to =systemctl restart NetworkManager=),
  =dns-test= (temporary 1.1.1.1 override, auto-reverted). Each confirms via the
  caller and verifies cleanup.
- =net doctor [--json] [--fix]= — one-shot "get me online" mode for the console:
  runs the full diagnose, then applies the lightest repair that fits (unblock
  rfkill, reset, bounce, open portal) — read-only without =--fix=, acting with
  it. The TTY recovery path when waybar/the GUI is down (see the Makefile
  targets).
- =net portal= — run =captive='s portal-force flow (reset if needed, extract +
  open the portal page).
- =net speedtest [--json]= — =speedtest-go --json= run; down/up/ping.

* nmcli contract

The command wrapper is the reliability boundary; SSIDs and connection names
contain spaces, colons, duplicates, hidden names, and non-ASCII. Rules:

- *Terse, field-selected output*: =nmcli -t -f <fields> --escape yes ...= and
  =nmcli -g <fields> ...= (get-values) for single-value reads. Parse with the
  documented escaping (=\:= and =\\=); never naive =cut -d:=.
- *UUID is the handle.* Every saved-profile op (=up=, =down=, =modify=, =delete=)
  uses the connection UUID, never the display name — names duplicate and contain
  separators. =net list= surfaces UUIDs; the panel maps row → UUID.
- *Wait budgets*: activation/deactivation use =nmcli --wait <n>= with an explicit
  budget (hot-path reads =--wait 2=; activation =--wait 30=). No unbounded waits.
- *Connectivity*: NM's own =nmcli networking connectivity= can return
  =none/portal/limited/full/unknown=. Use it as a *cheap hint* on the fast path
  when present, but the authoritative captive verdict is still our own probe
  (NM's portal detection is coarser and config-dependent).
- *Parser tests* (fake nmcli fixtures): escaped colons and backslashes in SSIDs,
  embedded newlines, duplicate connection names, hidden SSID (empty name),
  non-ASCII SSID, the wired-appears-mid-session case, and the multi-active case
  (wifi + tether both up).

* JSON schemas

Versioned (="v": 1=) envelopes so tests lock the contract. Sketches (fields
nullable unless noted):

- =status=: ={v, iface, type: wifi|ethernet|none, ssid, signal, ipv4,
  gateway, throughput: {rx_bps, tx_bps}, connectivity: online|captive|no-internet|unknown,
  connectivity_age_s, connectivity_class: fresh|stale|expired|unknown, state:
  online|captive|no-internet|connecting|disconnected|airplane|wired|degraded}=.
- =probe=: ={v, result: online|captive|no-internet|unknown, portal_url, http_code,
  redirect_host, elapsed_ms, ts}=.
- =list=: ={v, connections: [{uuid, name, type, active, last_used, signal,
  in_range, security}]}=.
- =diagnose=: ={v, steps: [<diagnostic step, see contract>], overall:
  ok|warn|fail}=.
- =speedtest=: ={v, down_mbps, up_mbps, ping_ms, server, elapsed_ms, ts}=.
- error envelope (any command): ={v, error: {code, message, detail, partial:
  bool}}= with a non-zero exit.

* Diagnostics contract

=net diagnose --json= returns an ordered list of steps. Each step is the unit the
panel renders and the log records:

- =id= — stable identifier (e.g. =link=, =dhcp=, =gateway-ping=, =dns-config=,
  =dns-resolve=, =http-probe=, =portal=).
- =status= — =pending | running | pass | warn | fail | skipped=.
- =title= — short human label.
- =evidence= — redacted detail (the value seen), per the redaction rules.
- =elapsed_ms=.
- =safety= — =read-only= or =mutating= (diagnose steps are all read-only).
- =next_action= — what the user/agent should do on warn/fail (e.g. "open portal",
  "reset connection", "switch network").

Repair actions (=net repair=) carry the same shape but =safety: mutating=, plus a
=cleanup_verified: bool= field (e.g. the DNS override was reverted) and a
terminal =cleanup-unverified= status when revert can't be confirmed.

** Diagnose vs Repair (read-only vs mutating)
The panel separates them visually and behaviorally:
- *Diagnose* — probe, gateway ping, DNS config read, captive check. No state
  change, no sudo, runnable freely.
- *Repair* — reset (fresh MAC, deletes+recreates the NM profile), DNS override
  test (mutates resolver, auto-reverts), portal force. Each needs an explicit
  confirm, shows that it's privacy/state-changing, and verifies cleanup. A
  Repair whose cleanup can't be verified ends in a visible =cleanup-unverified=
  state, never a silent success.

* Failure states, messages, recovery

Each row below gives the *exact, final* user-facing string (not a template) with
=<placeholders>= for redacted evidence, plus the evidence field included and the
next action. The string is canonical: every surface renders the same text, so
there's one source of truth.

Per-surface rendering of the canonical string:
- *Indicator* — the matching glyph + CSS class; the string is the tooltip
  (untruncated).
- *Notification* (=notify=) — title = the failure label, body = the string.
- *CLI* — the string on stderr; =--json= puts it in =error.message= with the
  evidence in =error.detail= and a stable =error.code=.
- *Panel* — the string as the section banner, with the diagnostic step's evidence
  shown beneath.
Evidence is always redacted per the redaction rules (SSID/host shown; PSK/EAP/
portal tokens never).

- *associated, no DHCP* — "Connected to <SSID>, no IP (DHCP failed)" →
  evidence: SSID, iface → reset / reconnect.
- *no-internet* — "On <SSID>, no internet (gateway reachable, no route out)" →
  diagnose / switch network.
- *captive* — "Captive portal at <host> — login required" → Open portal.
- *DNS hijack* — "DNS is being redirected (portal)" → Open portal.
- *DNS broken* — "DNS not resolving (hotel DNS down); 1.1.1.1 works" → use
  override / report.
- *HTTP intercepted* — "Traffic is being intercepted before it leaves" → Open
  portal.
- *sudo declined* — "Reset needs admin; it was declined — nothing changed" →
  retry with auth.
- *command timed out* — "<op> timed out; the system was left unchanged" → retry.
- *partial mutation* — "<op> partially applied: <what>; rolled back to <state>"
  → review.
- *missing speedtest-go* — "speedtest-go not installed" → install hint.
- *no wifi hardware* (desktop) — wifi rows hidden; ethernet-only view.
- *wifi rfkill-blocked* — "WiFi is blocked (rfkill)" → unblock. The indicator
  detects a soft-blocked radio (=rfkill list= shows the radio off though hardware
  is present) and shows this distinct from disconnected. =net repair rfkill= (and
  =net doctor --fix= as its first step) runs =rfkill unblock wifi= + =nmcli radio
  wifi on= and reconnects. This is the framework-laptop case: an out-of-power
  shutdown sometimes leaves wifi soft-blocked at next boot, and yes — the module
  recovers it (the rfkill state is the indicator; the rfkill repair / doctor is
  the one-step fix). A *hard* block (physical switch) is reported as
  not-recoverable-in-software with that message.
- *wifi rfkill hard-blocked* — "WiFi is blocked by the hardware switch" →
  evidence: rfkill hard state → flip the physical switch.
- *wrong password / missing secret* — "Saved password for <SSID> was rejected" →
  evidence: SSID, NM auth-failure reason → re-enter the password.
- *enterprise auth/cert failure* — "Enterprise login failed for <SSID> (802.1X)"
  → evidence: SSID, EAP failure reason → edit the profile in nmtui/nmcli.
- *upstream / AP / provider* — "On <SSID>, link is fine but the network has no
  uplink" → evidence: gateway reachable, no route out, not a portal → switch
  network or contact the venue.
- *VPN-routed* — "Connected; internet is routed through a VPN (<dev>)" →
  evidence: default route on a tun/wg device or non-NM DNS owner → check the VPN,
  not WiFi.
- *HTTP interception, no parseable portal URL* — "A portal is intercepting
  traffic but didn't give a login link" → evidence: HTTP code, redirect host →
  opens neverssl + the gateway page to log in manually.
- *DNS override cleanup unverified* — "Couldn't confirm DNS was restored after the
  test" → evidence: iface, attempted revert → revert DNS manually
  (=resolvectl revert <iface>=).

Each message names whether the system was left unchanged, partially changed (with
what), or fully changed, so the user knows the residue.

* Doctor: escalation, classification, terminal states

=net doctor= diagnoses, classifies the failure, then (with =--fix=) applies the
*lightest* repair that fits and re-checks — it never loops destructive repairs
against a failure they can't fix. Each failure resolves to one of four outcomes,
and the doctor stops at any terminal one:

- =fixable= — a local repair should help. Escalate lightest-first: rfkill-unblock
  → reset (fresh MAC) → bounce (full stack) → portal, re-probing after each, and
  stop as soon as the probe returns online.
- =needs-user-action= (terminal) — no reset/bounce will help; doctor stops and
  names the exact next step. Covers: wrong WPA password / missing NM secret
  (enter the password), locked keyring or polkit denial (retry with auth),
  enterprise 802.1X cert/identity failure (edit the profile in =nmtui=/=nmcli=),
  captive portal login-required (open the portal + accept terms). Doctor must not
  delete/recreate the profile against these — that loses the saved password and
  makes things worse.
- =upstream-not-local= (terminal) — the local link is up but the problem is past
  it: AP has no uplink, gateway down/dropping traffic, DHCP server broken, ISP
  outage, portal backend failing. =diagnose= proves it (link up + IP + gateway
  reachable, but no route out and not a captive redirect), and =doctor --fix=
  stops after local repairs are exhausted with "local repairs tried; likely
  upstream/AP/provider" + the evidence. Next action: switch networks or contact
  the venue.
- =deferred/vpn= (terminal for v1) — an active VPN / policy route / non-NM
  resolver owns the default route or DNS, so "no internet" may be the VPN's fault,
  not WiFi's. v1 *detects* this (default route on a =tun/wg= device, or DNS owned
  by something other than the NM link) and classifies it separately — "link is
  fine; internet is VPN-routed" — rather than misclassifying it as a WiFi failure.
  v1 does not repair it (VPN management is Phase 5); it names the VPN as the likely
  owner and stops.

** DNS handling in doctor (explicit per class)
- *Captive DNS hijack* — open the portal (the hijack clears on login). No DNS
  mutation.
- *Broken resolver, 1.1.1.1 works* — doctor offers an explicit *temporary* 1.1.1.1
  override as a repair with cleanup verification (auto-revert, =cleanup_verified=);
  without =--fix= it only recommends the command. It does not leave a permanent
  resolver change.
- *Port-53 / egress blocked* (even 1.1.1.1 fails) — terminal =upstream-not-local=;
  doctor stops, since it's not locally fixable.

* Failure-mode coverage

For each common field failure: does =net diagnose= detect it, can =net doctor
--fix= repair it, and what terminal user action remains when it can't. (The
=needs-user-action= / =upstream-not-local= / =deferred/vpn= outcomes are defined
above.)

| Failure mode               | diagnose detects            | doctor --fix                | terminal user action        |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| rfkill soft block          | yes                         | yes (unblock)               | none                        |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| rfkill hard block          | yes                         | no                          | flip the physical switch    |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| no wifi hardware           | yes                         | n/a                         | use ethernet                |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| associated, no DHCP        | yes                         | yes (reset/bounce)          | none, else switch network   |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| gateway unreachable        | yes                         | yes (bounce)                | switch network if it        |
|                            |                             |                             | persists                    |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| captive DNS hijack         | yes                         | opens portal                | log in at the portal        |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| broken DNS, 1.1.1.1 works  | yes                         | yes (temp override,         | report the venue's DNS      |
|                            |                             | auto-reverted)              |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| HTTP captive portal        | yes                         | opens portal                | log in at the portal        |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| HTTP interception, no      | yes                         | opens neverssl + gateway    | log in manually             |
| parseable URL              |                             |                             |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| upstream / AP outage       | yes (link up, no route out) | no (stops after local)      | switch network / contact    |
|                            |                             |                             | venue                       |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| wrong WPA password /       | yes                         | no                          | enter the password          |
| missing secret             |                             |                             |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| enterprise auth / cert     | yes                         | no                          | edit the profile in         |
| failure                    |                             |                             | nmtui/nmcli                 |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| duplicate SSID /           | yes (UUID-keyed)            | yes (activate by UUID)      | none                        |
| connection-name            |                             |                             |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| hidden SSID                | yes                         | yes (connect by name)       | enter SSID + password       |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| multiple active links      | yes                         | n/a                         | pick the interface          |
| (wifi+tether)              |                             |                             |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| wedged NetworkManager      | yes                         | yes (bounce → restart NM)   | none, else reboot           |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| slow / hung command        | yes (degraded)              | retries within budget       | retry                       |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| stale / corrupt cache      | yes                         | self-heals (atomic +        | none                        |
|                            |                             | invalidation)               |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| DNS cleanup failure        | yes                         | flags cleanup-unverified    | revert DNS manually         |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| missing speedtest backend  | yes                         | n/a                         | install speedtest-go        |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|
| VPN / policy-routing       | yes (route/DNS ownership)   | no (deferred to Phase 5)    | check the VPN               |
| interference               |                             |                             |                             |
|----------------------------+-----------------------------+-----------------------------+-----------------------------|

* Observability — logging + redaction

- *Event log*: JSONL at =$XDG_STATE_HOME/net/events.jsonl= (fallback
  =~/.local/state/net/events.jsonl=), size-rotated (e.g. 1 MB × 3). Every
  mutating op and probe appends an event: =ts, op, argv (redacted), exit_code,
  stderr_tail, elapsed_ms, iface, nm_uuid, probe_url_class, http_code,
  redirect_host, cache_event=.
- *Redaction (always on)*: PSKs, EAP identities/passwords, NM secrets, and
  portal query tokens are never logged. MAC addresses, full IPs, and SSID are
  redacted when configured (=redact_mac=, =redact_ip=, =redact_ssid= in config).
- *Post-failure diagnosis*: =net doctor --json= is the snapshot + recommendation
  (diagnose plus the suggested repair), =net diagnose --json= the raw report, and
  the event log the history. =net doctor= is the console-recoverable entry point
  (reachable as =make online= / =make net-doctor=).
- *Secret-leak tests*: assert no PSK/EAP/portal-token ever appears in any JSON
  output, log line, or error message.

* Indicator (task #C — Phase 1, the fast win)

** States (internet sub-state on top of link state)
- online — associated and the probe returned 204. Normal icon.
- captive — associated, probe hit a portal. Distinct glyph + warning CSS class;
  tooltip names the portal host; left-click opens diagnostics with the portal
  ready to open (Phase 2+; see interactions for the Phase-1 interim).
- no-internet — associated, probe failed (no portal, no 204). Distinct glyph +
  warning class.
- degraded — =net status= couldn't read link state within budget (slow/failed
  nmcli). Neutral glyph, =net-degraded= class. Never blocks the bar.
- rfkill-blocked — the radio is soft-blocked (=rfkill=), distinct from
  disconnected. Distinct glyph; the fix is =net repair rfkill= / =net doctor=.
- connecting / disconnected / airplane / wired — as today, plus wired shown
  correctly even when it appears after session start. (airplane is now this
  module's state, absorbed from the retired airplane module.)

** Glyphs
Nerd-font codepoints, final values verified live before merge (same discipline
as wtimer). Reuse the signal-strength ramp already in =waybar-netspeed=; add a
captive / no-internet / degraded overlay glyph.

** Tooltip
SSID + signal + IPv4 + gateway + the throughput readout (absorbed from
netspeed) + the last probe result and its age (stale/expired hinted).

** Interactions (phase-aware; no keyboard-modifier clicks — waybar can't qualify
clicks by modifier, so the rich actions live in the panel, not ctrl/super-click)
Clicks never block the bar: each dispatches a detached background job and reports
via =notify=, single-flight per action.
- *Phase 1 (no panel yet)*: left-click runs =net probe= + notify (refreshes the
  state on demand) and keeps the existing =pypr toggle network= scratchpad as the
  interim manager; right-click runs =net repair reset= in the background +
  notify; middle-click runs =net portal=.
- *Phase 2+ (panel exists)*: left-click opens the panel (focused on the relevant
  section — diagnostics when captive); right/middle keep the background
  reset/portal shortcuts.

* Panel (tasks #B + #C diagnostics — Phases 2-3)

GTK4 + gtk4-layer-shell, pocketbook scaffold (src-layout package, unittest,
Makefile, gtk4-layer-shell anchored dropdown under the bar). One panel shell,
reused by the future desktop-settings panel.

Sections:
1. *Connections* — list, MRU-first, active marked, live signal bars for in-range
   wifi; row click switches; buttons for add / edit / remove; a rescan control.
2. *Diagnose* (read-only) — Probe (204/captive, shows portal URL + Open), Gateway
   ping, DNS config. Streaming step output (the diagnostics contract).
3. *Repair* (mutating, confirmed) — tiered lightest-first: Unblock rfkill, Reset
   (fresh MAC), Bounce (full stack), DNS override test, Force portal. A "Get me
   online" button runs =net doctor --fix= (the auto-escalating sequence).
4. *Speed test* — Run button, progress, down/up/ping result + last-run line.

** Panel state, cancellation, permissions
State machines for: connection-list loading, rescan-in-progress,
activation-in-progress, diagnose-running, repair-running, speedtest-running. Plus
the real terminal states on this two-machine fleet: no-wifi-hardware (desktop →
ethernet-only view) and missing speedtest-go. (No GPG-key state — there's no
credential store; secrets live in NM.) ("No NetworkManager" is not a modeled
state — NM is always present
on these machines; if nmcli is somehow absent the panel shows a single hard-error
and exits.) Long operations show elapsed time and are cancellable where the
underlying op allows (rescan, speedtest, probe); clearly non-cancellable ones
(an in-flight activation) show elapsed + a disabled control. Permission-denied
(sudo/polkit declined) is a first-class outcome with the "nothing changed"
message, never a silent failure.

Interaction-pattern catalog (=~/code/rulesets/patterns/=) principles that apply:
- transient-state-buttons — all the network levers in one place, reachable by
  one chord (the bar click), state visible.
- default-most-common-friction-proportional — connections MRU-ordered so the
  common pick is first; destructive ops (remove) and privacy-changing ones
  (reset, override) get a confirm, switching does not.
- one-prompt-picker-typed-prefix — if the connection picker ever goes
  keyboard-driven, kind (wifi/eth/saved/in-range) + name in one typed picker.

** Panel UX flow (settle before Phase 2)
The concrete interaction defaults, so the GTK build isn't inventing them:
- *Default focus*: the Connections section, current connection's row selected. If
  the indicator opened the panel because of a captive/no-internet state, focus
  Diagnose instead with the relevant action highlighted.
- *Row content*: glyph (signal bars / wired / active check) + name + a secondary
  line (security type, "active"/last-used). The active row is visually pinned at
  top of its group.
- *Buttons*: one *primary* per section (Connections: Connect to the selected row;
  Diagnose: Run diagnose; Repair: "Get me online"; Speed test: Run). Secondary
  actions (add / edit / remove / rescan; individual repair tiers) are smaller and
  grouped.
- *Disabled rules*: Connect disabled on the already-active row; Repair tiers
  disabled while one runs; Speed test disabled while running; add/edit disabled
  for enterprise (with the "edit in nmtui/nmcli" hint).
- *Confirmations* (exact wording): Reset → "Reset <SSID>? This drops the
  connection and reconnects with a new MAC."; Bounce → "Restart networking? All
  links drop briefly."; DNS override → "Temporarily set DNS to 1.1.1.1 for the
  test? It reverts automatically."; Remove → "Forget <SSID>? The saved password is
  deleted."
- *"Get me online" reporting*: shows each escalation step live (Unblock rfkill →
  Reset → Bounce → Portal) with per-step pass/fail and stops at the first that
  restores internet or at a terminal state, naming the next action.
- *After close*: the bar reflects the new state immediately (signal/refresh on
  next poll); a running speedtest/diagnose keeps running and notifies on finish
  (panel close doesn't cancel it).
- *Keyboard*: Esc closes; Tab moves between sections; arrows move rows; Enter
  fires the section primary; the connection list is type-to-filter.

* Connection management (nmcli)

- Every op via nmcli per the nmcli contract above (terse, escaped, UUID-keyed,
  bounded =--wait=).
- MRU ordering from NM's =connection.timestamp= (last activated), descending.
- Ethernet appears in the list whenever a wired device is present, selectable at
  any time; switching just brings the chosen connection up.
- *Mutation safety + rollback*: switching keeps the current connection up until
  the new one activates successfully (=nmcli --wait 30=); on failure it does not
  tear down the working link, surfaces the failure, and leaves the prior
  connection active. =net down= notes that NM may auto-reactivate a profile and
  reports the post-op active connection so the user isn't surprised. A switch that
  needs a password it doesn't have prompts (or fails with "password required"),
  never silently strands. The exact NM command sequence (preflight active-state
  read → activate target → verify default route → on failure, confirm prior
  still up) is pinned in the engine and tested against fake nmcli.
- *Add/edit scope*: open + WPA-PSK only in v1. Existing saved profiles of any
  type (including enterprise) can be *activated*; editing an enterprise profile
  shows "edit via nmtui/nmcli" rather than a broken partial form.

* Connection secrets (no separate store)

Per Craig's call: don't build a parallel credential store. Settings and secrets
live where NetworkManager already keeps them, so there's one source of truth and
no extra dependency (no GPG, no gpg-agent, no =~/.config/net/connections=).

- *Where secrets live*: =/etc/NetworkManager/system-connections/<name>.nmconnection=,
  root-owned =0600=, with the PSK/EAP secret stored inline (the default
  =secret-flags=0= "owned by NM"). That's already secure-at-rest (root-only) and
  is what =nmcli= reads/writes.
- *How we touch them*: every add/edit/remove goes through =nmcli= (=connection add
  / modify / delete=), which writes the =.nmconnection= with the right ownership
  and perms. We never read or write =system-connections= files directly (root) and
  never copy a secret out of them.
- *No export / import / sync* — there's nothing to sync. A new machine gets its
  connections the way it always has (the user joins, or restores NM profiles),
  not from a tool-specific vault.
- *config file*: =~/.config/net/config= still exists, but only for non-secret
  preferences (speedtest server, redaction flags, probe TTL). It holds no
  credentials.
- *No secret leakage*: PSK/EAP never appear in =net=' =--json= output, the event
  log, or error text (tested) — even though NM is the store, our surfaces must not
  echo a secret =nmcli= happens to return.

* Speed test

- Backend: *=speedtest-go=* (=--json=, =--server=, =--no-download/--no-upload=),
  already installed on velox (AUR =speedtest-go-bin=). No new dependency for v1.
  librespeed-cli is the documented fallback for a self-hosted LibreSpeed server.
- =net speedtest --json= parses speedtest-go's JSON into the =speedtest= schema.
- *Server policy*: auto-select nearest by default; allow a pinned server id in
  =~/.config/net/config=.
- *Timeout + cancellation*: a hard run timeout (e.g. 60s); the panel run is
  cancellable (kills the child). Offline / rate-limited / no-server errors map to
  the failure-message table.
- *Tests*: fixture JSON (success) and fixture stderr (offline, no server,
  malformed output) drive =net speedtest= parsing without touching the network.

* Help + documentation

In-app help has three layers, each reachable in the situation it's needed:

- *CLI help (works from a dead-GUI TTY)*: =net --help= lists the subcommands in
  one screen; =net <cmd> --help= documents each (flags, what it mutates, the
  console-recovery targets). The Makefile targets are self-describing (=make help=
  lists =online= / =net-doctor= / etc. with one-line descriptions). This is the
  layer that matters most when you're at a console with no network.
- *Panel help (in the GUI)*: a small =?= affordance in the panel header opens an
  inline help pane — what each section does, which Repair actions mutate state,
  what the indicator glyphs/colors mean. Per-control tooltips on the less-obvious
  buttons (rfkill, bounce, DNS override). No external help browser.
- *User guide (the durable doc)*: a README / docs page covering every command,
  the indicator states + glyphs, the panel sections, the config file keys, the
  recovery make targets, troubleshooting (the failure-message table), and
  rollback. Written so a future session — or Craig six months out — can operate
  and recover the module from the doc alone.

The failure-message table above is the single source of truth for the
troubleshooting text; the guide and the panel help both render from it rather
than restating it.

* Enhancement radar

Low-cost adjacent affordances, each dispositioned so cheap wins aren't lost and
the v1 panel stays focused. (Several are already in v1 by virtue of other
sections; marked here so the consideration is visible.)

| Enhancement                         | Disposition | Reason                                                 |
|-------------------------------------+-------------+--------------------------------------------------------|
| Open / copy portal URL              | v1          | already in the captive flow; trivial Open + Copy       |
|-------------------------------------+-------------+--------------------------------------------------------|
| Forget network                      | v1          | it's the remove op, already specced                    |
|-------------------------------------+-------------+--------------------------------------------------------|
| Rescan now                          | v1          | already a Connections control                          |
|-------------------------------------+-------------+--------------------------------------------------------|
| Retry with hardware MAC             | v1          | captive already has --hardware-mac; expose in Repair   |
|-------------------------------------+-------------+--------------------------------------------------------|
| Pin speedtest server                | v1          | already a config key                                   |
|-------------------------------------+-------------+--------------------------------------------------------|
| Copy redacted doctor report         | v1          | cheap, serves the observability/support goal           |
|-------------------------------------+-------------+--------------------------------------------------------|
| Show last good network / result     | vNext       | needs small history persistence                        |
|-------------------------------------+-------------+--------------------------------------------------------|
| Watch mode for net doctor           | vNext       | a --watch loop; handy at a TTY, not v1-critical        |
|-------------------------------------+-------------+--------------------------------------------------------|
| Actionable desktop notifications    | vNext       | dunst supports actions; extra wiring                   |
|-------------------------------------+-------------+--------------------------------------------------------|
| Keyboard connection picker (fuzzel) | vNext       | the typed-prefix pattern; panel covers v1              |
|-------------------------------------+-------------+--------------------------------------------------------|
| QR-code share / import WiFi         | rejected    | low value for a personal 2-machine setup; phones do QR |
|-------------------------------------+-------------+--------------------------------------------------------|

* Waybar wiring

- Replace =custom/netspeed= with =custom/net= in the bar's module list (same
  slot).
- Module def: =exec: waybar-net=, =return-type: json=, =interval: 2=, a =signal=
  for on-demand refresh (next free signal after wtimer's 14), =on-click=,
  =on-click-right=, =on-click-middle= per the phase-aware interactions (each
  dispatches a detached job, never blocks).
- Remove the old =on-click: pypr toggle network= scratchpad only once the panel
  replaces it (Phase 2); Phase 1 keeps it as the interim manager.

* Testing plan (TDD)

- *Engine (normal)* — fake =nmcli= + =curl= + =speedtest-go= on PATH; assert
  command sequences and parsed/emitted JSON for status, list, up/down,
  add/edit/remove, probe, diagnose, repair, speedtest. Pure state/format
  functions tested directly. JSON schemas locked by example.
- *Portal parser* — already covered in =tests/captive= (Normal/Boundary/Error +
  the real SONIFI body). The engine's native probe reuses the same cases.
- *nmcli parsing* — escaped colon/backslash/newline in SSID, duplicate names,
  hidden SSID, non-ASCII, wired-mid-session, multi-active (wifi+tether).
- *Failure + concurrency (the risky classes)* — slow/hung nmcli/curl/speedtest
  (degraded state within budget), concurrent =net status= probe refresh
  (single-flight), corrupt cache (recovered), stale cache after SSID change
  (invalidated), permission denied / sudo declined, DNS-override cleanup failure
  (=cleanup-unverified=), NM partial activation (rollback keeps prior link),
  secret redaction, missing speedtest-go, no wifi hardware, rfkill soft/hard
  block.
- *Doctor classification* — fixture-driven =net doctor= over fake nmcli/curl
  asserting the right terminal classification + that =--fix= stops before
  destructive repairs: auth failures (=needs-user-action=), upstream/AP failure
  (=upstream-not-local=), VPN-routed failure (=deferred/vpn=), and the DNS classes
  (hijack → portal, broken-but-1.1.1.1-works → offered override, egress-blocked →
  upstream). Assert the failure-mode coverage table's "detects / repairs / terminal
  action" holds for each row.
- *Indicator* — drive =net status --json= through =waybar-net=, assert the JSON
  per state (online / captive / no-internet / degraded / wired / disconnected /
  rfkill), iface override via env.
- *Panel* — pocketbook-style: backing logic (list ordering, op dispatch,
  state-machine transitions), not GTK widgets.
- *NM secrets / no-leak* — add/edit writes the secret into NM via nmcli (asserted
  against fake nmcli, never to a tool-owned file); assert no PSK/EAP appears in any
  =--json=, log line, or error (there is no credential store to round-trip).
- *Live checklist (gated out of the suite)* — a "Manual testing and validation"
  task per phase for the real-network states (captive at a hotel, no-internet,
  switch under load, reset, speedtest) that can't be faked.

** Harness + coverage gate
The concrete contract, matching the repo's existing convention (not pytest — the
dotfiles suites are =unittest=, run by =make test= as =python3 -m unittest= over
=tests/*/test_*.py=; 33 suites today):
- *Framework*: =unittest=. Each suite is =tests/<name>/test_<name>.py=
  (=tests/net/=, =tests/waybar-net/=), collected by the existing =make test= loop
  — no new runner, no pytest dependency.
- *Fakes on a temp PATH*: =fake-nmcli=, =fake-curl=, =fake-speedtest-go=,
  =fake-rfkill=, =fake-resolvectl= live as executable stubs in =tests/<name>/=
  (the =tests/layout-navigate/fake-hyprctl= pattern). A fixture file encodes the
  command→canned-output map and the stub appends each invocation to a log the test
  asserts against. Subprocess timeouts are simulated by a stub that sleeps past the
  budget; =net status= must still return the degraded state.
- *Waybar wrappers end-to-end*: =waybar-net= is run as a subprocess with the fake
  PATH and the env overrides (iface, cache path), asserting the emitted JSON — same
  as =tests/waybar-netspeed=.
- *Coverage*: coverage.py is absent system-wide (and not importable), so coverage
  runs in a throwaway venv (=python3 -m venv=, =pip install coverage=, =coverage
  run -m unittest=, =coverage report=) — the method the wtimer suite used (95%).
  Target: *branch* coverage over =net/= and the wrapper, ≥ 90% on the pure
  classifier/parser modules.

** Coverage as a gap-finder, not a number (per phase)
Line coverage alone misses the branches that matter here, so each phase ends with
a *coverage-gap pass*, not just a percentage:
- After the first green run, read the branch report and map every uncovered branch
  to either a new test or a consciously-excluded live-only behavior (with a comment
  or a Manual-testing entry naming it).
- *Branch coverage is required* for the pure logic: the doctor classifier (every
  outcome — fixable / needs-user-action / upstream-not-local / deferred-vpn), the
  cleanup-unverified path, the redaction paths, the degraded hot-path fallback, the
  timeout branches, and the portal/nmcli parsers.
- A phase isn't "done" until its coverage-gap pass is recorded — uncovered logic is
  either tested or explicitly excused, never silently uncovered.

* Files touched (planned, all in =~/.dotfiles=)

- =net/= package (src-layout, like pocketbook) — engine + panel.
- =hyprland/.local/bin/waybar-net= — the indicator (replaces =waybar-netspeed=).
- =hyprland/.local/bin/net= — engine CLI entry (console-script shim).
- =hyprland/.config/waybar/config= — swap =custom/netspeed= → =custom/net=;
  remove =custom/airplane=.
- =hyprland/.config/waybar/style.css= — captive / no-internet / degraded /
  rfkill classes; remove airplane classes.
- =tests/net/=, =tests/waybar-net/= — suites.
- =captive= — refactor: extract probe + reset into functions callable
  non-interactively (a =--json= probe mode) so the engine reuses them.
- =~/.config/net/config= — seed config (probe TTL, speedtest server, redaction
  flags). No secrets; not a credential store.
- dotfiles =Makefile= — add the console-recovery targets (=online=, =net-doctor=,
  =net-status=, =net-diagnose=, =net-portal=, =net-reset=, =net-bounce=).
- *Deletions once net ships* (the airplane module is absorbed):
  =hyprland/.local/bin/waybar-airplane=, =hyprland/.local/bin/airplane-mode=,
  =tests/waybar-airplane/=, =tests/airplane-mode/=, and the =custom/airplane=
  module + its css.
- archsetup Hyprland step — add =gtk4-layer-shell=, =python-gobject=,
  =speedtest-go-bin= to the install lists (the only archsetup change; no =gpg=
  added, secrets stay in NM's store).

* Resolved decisions (Craig's calls + this response)

1. Panel UI tech → GTK4 + gtk4-layer-shell, shared pocketbook scaffold (one
   panel shell, reused by the desktop-settings sibling).
2. Engine language → Python =net= package; shells out to =captive= for the
   portal-force flow, native cheap probe for the bar path.
3. Connectivity probe → split cadence (fast link poll every 2s + slow cached
   internet/captive probe, TTL ~45s) with single-flight + atomic cache.
4. No keyboard-modifier clicks (waybar can't qualify them) — the panel hosts the
   rich actions; bar clicks dispatch detached jobs (phase-aware).
5. No separate credential store (Craig's call, cj). Secrets live in NM's own
   =system-connections= (root =0600=, inline), touched via nmcli. No GPG, no
   gpg-agent, no =~/.config/net/connections=. Supersedes the earlier GPG-store
   design.
6. =custom/netspeed= absorbed into =custom/net=; throughput moves to the tooltip.
7. Speed-test backend → =speedtest-go= (already installed), not a new
   librespeed-cli dependency; librespeed-cli is the self-hosted fallback.
8. Code lives in the dotfiles repo; archsetup only installs deps.
9. v1 add/edit scope = open + WPA-PSK; enterprise/802.1X is activate-only,
   add/edit is vNext (settled by Craig 2026-06-29 — no enterprise networks in his
   history, so the form would be unused UI).
10. =net doctor= is in v1 (Craig's call, cj) — a one-shot diagnose+fix mode,
    reachable from a TTY via =make online= / =make net-doctor=. (The earlier
    "defer the doctor/bundle command" decision is reversed.)
11. Diagnose (read-only) and Repair (mutating, confirmed) are separated in the
    panel and the CLI; Repair is tiered lightest-first (rfkill → reset → bounce).
12. =custom/net= absorbs the airplane module (Craig's call, cj). *As built
    (2026-06-29, option 1): display-only.* net shows the airplane state (reads
    the airplane-mode state file); the =airplane-mode= low-power toggle is kept
    (radios + CPU + brightness + services is not a network concern) and moved to
    =custom/net='s right-click + signal 15. Only the redundant display pieces —
    =waybar-airplane=, =custom/airplane=, and the retired =waybar-netspeed= —
    plus their tests/css were deleted. The earlier "delete airplane-mode" framing
    is superseded.
13. Repair includes a full-stack bounce and an rfkill-unblock (Craig's calls,
    cj) — the latter recovers the framework-laptop post-power-loss soft-block.
14. VPN / WireGuard is a planned Phase 5 (Craig's call, cj), not a permanent
    exclusion.

* Implementation phases

- *Phase 1 — Indicator + console recovery (task #C).* =net status= + =net probe=
  (native cheap probe, reusing captive's logic) + the =captive= probe refactor +
  =waybar-net= + the split-cadence cache (single-flight, atomic, stale classes) +
  CSS states (incl. rfkill) + performance budget. Plus the CLI-only recovery path:
  =net repair= tiers (rfkill / reset / bounce), =net doctor [--fix]=, and the
  Makefile targets (=make online= etc.) — all testable without the GTK panel.
  Absorbs the airplane state and removes the standalone airplane module. Interim
  left-click keeps the existing scratchpad until the panel lands.
  - *Acceptance*: fresh-login waybar smoke test shows correct state on
    online/captive/no-internet/wired/rfkill; =net status= stays within budget
    under a fake slow nmcli (degraded state); =net doctor --fix= recovers a
    soft-blocked radio from a TTY; the live captive checklist passes at a real
    portal; the airplane state works and the old airplane module is gone;
    reverting = swap =custom/netspeed= + =custom/airplane= back.
- *Phase 2 — Panel shell + connection management (task #B core).* GTK4
  layer-shell scaffold + =net list/up/down/add/edit/remove/rescan= + MRU list +
  mutation safety/rollback + panel state machines.
  - *Acceptance*: switch wifi↔wifi and ethernet↔wifi without stranding; a failed
    switch leaves the prior link up; add/edit open + WPA-PSK writes the secret to
    NM; remove confirms; panel states render for loading/rescan/activation.
- *Phase 3 — Diagnostics + speed test in the panel.* Wire =net diagnose= /
  =net repair= / =net doctor= / =net portal= / =net speedtest= into the Diagnose
  vs Repair sections; the "Get me online" button; portal Open button; speedtest
  progress + cancel.
  - *Acceptance*: diagnose runs read-only; each repair tier confirms + verifies
    cleanup (DNS override reverts, shown); speedtest result parses from
    speedtest-go and a fixture-driven failure shows the right message.
- *Phase 4 — Docs + rollout.* In-app help (=net --help= / per-command help, the
  panel help affordance), README/user-guide (commands, panel, config,
  troubleshooting, the make targets, rollback), and the manual dep step on ratio.
  - *Acceptance*: =net --help= and each subcommand's help are complete; the
    user-guide covers every command + the recovery targets; ratio rollout
    documented.
- *Phase 5 — VPN / WireGuard (future).* Fold the existing archsetup wireguard
  tooling into the same panel + CLI (=net vpn ...=). Out of the v1 milestone;
  specced separately when picked up.

* Open items / risks

- gtk4-layer-shell dropdown anchoring under a waybar module needs the same
  positioning work pocketbook solved; reuse it. (Phase 2.)
- The =captive= refactor must keep the standalone CLI behavior identical while
  exposing a non-interactive =--json= probe; covered by the existing
  =tests/captive= suite plus new probe-mode tests. (Phase 1.)
- speedtest-go server selection variance (nearest-server flor) — pin a server in
  config if results are noisy. (Phase 3.)
- The background-probe kick from =net status= must be truly non-blocking (spawn +
  detach); enforced by the single-flight lock and the performance benchmark test.

* Rollback

Each phase is independent. The indicator (Phase 1) is a drop-in replacement for
=custom/netspeed= (and =custom/airplane=); reverting is swapping those modules
back in the config and restoring their scripts. The panel is additive — not
wiring its clicks leaves the bar working as before. No credential store to roll
back (secrets stay in NM throughout).

* Review findings [31/31]

** DONE Define the structured diagnostics contract :blocking:
The spec says the engine "emits JSON" and that diagnostics "reuse =captive=
verbatim", but the current =~/.dotfiles/common/.local/bin/captive= flow is a
human-readable bash script that mixes diagnostics, sudo prompts, DNS mutation,
browser launch, and terminal prose. A GTK panel cannot reliably turn that into
clear state, progress, cancellation, or useful error messages. Define the
machine contract before implementation: every diagnostic step should have a
stable id, status (=pending/running/pass/warn/fail/skipped=), redacted evidence,
elapsed time, safety outcome, and next action. Keep =captive= as the interactive
CLI, but either refactor reusable probe/reset functions behind =net diagnose
--json= or make =captive= expose a non-interactive JSON mode. This blocks the
panel and logging work because otherwise the implementer must invent the
boundary.

Disposition: accept — added the "Diagnostics contract" section (per-step id /
status / evidence / elapsed / safety / next_action) and the =captive= =--json=
probe-mode refactor under Architecture + Files touched.

** DONE Specify user-facing failure messages and recovery actions :blocking:
The spec names failure states like =no-internet=, =captive=, failed probe,
failed reset, missing DNS, and missing speed-test backend, but it does not define
the messages the user sees or what each message tells them to do next. For this
feature, "error" is not enough: a user needs to know whether WiFi is associated,
whether DHCP succeeded, whether DNS is hijacked/broken, whether HTTP is
intercepted, whether sudo was declined, whether a command timed out, and whether
the system was left unchanged or partially changed. Add a message table for the
indicator, panel, and CLI with: failure class, visible text, evidence included,
redaction rule, and next action. This is blocking because UX quality here is the
product, not an implementation detail.

Disposition: accept — added the "Failure states, messages, recovery" section
covering each class, the visible message, the "what changed" residue note, and
the next action across indicator/panel/CLI.

** DONE Define the debug log and redacted support bundle :blocking:
There is no observability section. When this fails in a hotel or cafe, an agent
needs enough evidence to diagnose it without rerunning destructive actions. Add
log location, rotation/retention, JSONL event schema, command argv logging,
exit-code/stderr capture, elapsed time, selected iface, NM active connection
UUID, probe URL class, HTTP code, redirect host, DNS servers, and cache
read/write events. Also define a =net doctor --json= or =net debug-bundle=
command that emits redacted status, recent log events, dependency versions, and
a reproduction command. Redact SSID if configured, MAC addresses, portal query
tokens, PSKs, EAP identities/passwords, IPs when requested, and all GPG/NM
secrets. This blocks implementation readiness because post-failure diagnosis is
currently left to ad hoc terminal spelunking.

Disposition: modify — accepted the JSONL event log, the schema, and the redaction
rules in full (new "Observability" section). Deferred the dedicated =net
debug-bundle= / =net doctor= command to vNext: for a single-user tool =net
diagnose --json= (the snapshot) plus the event log (the history) cover
post-failure diagnosis; a bundle command is gold-plating for v1. Recorded under
Out + Resolved decision 10.

** DONE Pin the nmcli parsing and timeout contract :blocking:
The spec lists nmcli operations but not the exact fields, output modes, escaping
rules, ID semantics, or timeouts. This is risky because SSIDs and connection
names can contain spaces, colons, duplicates, hidden names, and non-ASCII; the
current =waybar-netspeed= already had an SSID parsing bug. The nmcli manual
documents =--terse=, =--get-values=, =--escape=, =--wait=, ID/UUID/path
selection, =passwd-file=, and built-in connectivity states
(=none/portal/limited/full/unknown=) at
https://man.archlinux.org/man/nmcli.1.en. The spec should require UUIDs for
saved-profile operations, explicit =--wait= budgets, parser tests for escaped
colons/backslashes/newlines/duplicate names/hidden SSIDs, and a decision on when
to use or ignore =nmcli networking connectivity [check]=. This is blocking
because the command wrapper is the core reliability boundary.

Disposition: accept — added the "nmcli contract" section: terse + =--escape= +
=--get-values=, UUID-keyed ops, explicit =--wait= budgets, NM connectivity as a
cheap hint (our probe authoritative), and the parser test matrix.

** DONE Define cache concurrency, atomicity, and stale-state behavior :blocking:
=net status= may spawn =net probe= whenever the cache is stale, but the spec
does not define locking, process coalescing, atomic writes, crash cleanup, or
what happens when the probe hangs. With a 2s Waybar interval, a bad network could
start overlapping probes, corrupt the runtime cache, or keep showing stale
"online" while the link is gone. Add a single-flight lock under
=$XDG_RUNTIME_DIR/waybar=, atomic write+rename for cache updates, max probe
runtime, stale age classes (fresh/stale/expired/unknown), cache invalidation on
iface/SSID/connection UUID change, and tests for concurrent =net status= calls.
This blocks the fast-path design because it is the main performance and
correctness risk.

Disposition: accept — added "Concurrency, atomicity, staleness" under the
Connectivity model: flock single-flight, temp+rename atomic write, ≤6s probe
timeout, fresh/stale/expired/unknown classes, iface/SSID/UUID invalidation, stale
lock reclaim, plus concurrency tests in the test plan.

** DONE Bound hot-path performance with measured budgets :blocking:
The spec says the cheap poll should be sub-100ms, but the proposed fast path
still may call multiple =nmcli= commands every two seconds, read sysfs, parse
throughput, and maybe spawn a background probe. The existing =waybar-netspeed=
had a deliberate sleep for throughput sampling; replacing it must define how
throughput is sampled without sleeping in the bar path. Add a per-command budget
for =waybar-net= and =net status=, a maximum number of subprocesses on the hot
path, a timeout for every subprocess, benchmark tests with fake slow =nmcli=,
and a rule that the indicator emits a degraded JSON state rather than blocking.
This is blocking because Waybar custom modules can visibly freeze or lag when
their exec path stalls.

Disposition: accept — added the "Performance budgets" section: <100ms typical /
<250ms worst, throughput sampled across the poll interval (no in-process sleep),
one nmcli call max on the hot path, timeouts on every subprocess, the degraded
state, and a fake-slow-nmcli benchmark test.

** DONE Make click actions non-blocking and visible :blocking:
Waybar right-click runs =net reset= and middle-click runs =net portal= directly.
Those operations can require sudo, open browsers, mutate DNS, delete/recreate NM
profiles, or hang on network commands, but Waybar click handlers provide no
panel, terminal, progress, or cancellation surface by default. Define whether
right/middle click instead opens the panel focused on the action, dispatches a
background job with notifications, or is removed from v1. If kept, specify
single-flight behavior, how sudo/polkit prompts surface, how success/failure is
reported, and how the user can inspect logs. This blocks UX readiness because
the fastest remediation path is currently the easiest place to hide failure.

Disposition: modify — accepted the concern; made the interactions phase-aware and
non-blocking. Every click dispatches a detached, single-flight background job and
reports via =notify=; sudo surfaces through polkit/the normal prompt; failures go
to the notify + the event log. In Phase 1 (no panel) left-click runs probe +
notify and keeps the scratchpad; from Phase 2 left-click opens the panel focused
on the action. Recorded in the Indicator "Interactions" subsection.

** DONE Specify connection mutation safety and rollback :blocking:
The spec says row click switches connections and remove gets a confirm, but it
does not define what happens when a switch partially succeeds, disconnects the
current working link, needs a password, loses the default route, or triggers
auto-activation. The nmcli manual warns that =connection down= does not prevent
future auto-activation and may internally block a profile until user action.
Define preflight, the exact NM command sequence, whether the old active
connection is kept until the new one proves usable, when rollback is attempted,
how long activation waits, and what the panel says when rollback fails. This is
blocking because the module can strand the user offline.

Disposition: accept — added "Mutation safety + rollback" under Connection
management: keep the prior link up until the target activates (=--wait 30=), no
teardown on failure, password-required surfaced not stranded, =net down= reports
post-op active state + the auto-reactivation caveat, and the pinned NM command
sequence is tested against fake nmcli.

** DONE Define the credential-store security model :blocking:
The GPG store is described as optional and default-unencrypted, but the spec does
not define file modes, schema, secret-source rules, import/export prompts,
recipient verification, stale secret handling, or what is logged. It also says
NM remains source of truth while the user-owned store contains PSK/EAP secrets,
which creates two truth sources for sensitive data. Add a precise schema,
=0600= file creation with parent-dir permissions, encrypted-recipient checks,
plaintext warning text, explicit opt-in flow, redaction requirements, behavior
when NM has a secret not in the store, behavior when the store has a secret NM
rejects, and tests for no secret leakage in JSON/logs/errors. This blocks Phase
4 and the full spec because otherwise the implementer must make security
decisions mid-code.

Disposition: accept — rewrote "Credential storage" with the versioned schema,
=0600= file / =0700= dir, recipient verification on opt-in, the plaintext
warning, secret-source rule (entered/exported, never harvested from root store),
the two-source reconciliation policy (NM wins live, store wins for what NM
lacks, stale-secret flagging), and the no-leak tests.

** DONE Define EAP, enterprise WiFi, and unsupported connection behavior :blocking:
The store says "PSK/EAP" and connection management says add/edit, but there is
no v1 contract for WPA-Enterprise fields, certificates, identity vs anonymous
identity, hidden networks, static IP, proxy settings, metered flags, MAC
randomization, or 802.1X prompt behavior. Either scope v1 to open/WPA-PSK plus
existing saved-profile activation, or define the minimum EAP form and the
unsupported-state messages. This blocks add/edit/import because enterprise WiFi
is too sensitive to hand-wave.

Disposition: modify (scope) — scoped v1 to open + WPA-PSK add/edit, with
*activation* of any existing saved profile (including enterprise). Enterprise /
802.1X add/edit, static-IP, proxy, metered, and MAC-randomization editing are
vNext, shown as "edit via nmtui/nmcli". Recorded in Scope/Out, Connection
management, and Resolved decision 9.

** DONE Split read-only diagnostics from mutating remediation :blocking:
The panel's diagnostics section includes probe, bounce/reset, gateway ping, and
DNS override test in one area, while =captive= currently performs resets and
temporary DNS changes as part of its flow. Users need to know which buttons are
read-only and which mutate NM profiles, MAC mode, DNS, or browser state. Add
separate "Diagnose" and "Repair" actions, confirmations for destructive or
privacy-changing operations, explicit cleanup verification for DNS override, and
a terminal state when cleanup is unverified. This blocks readiness because
network repair must not surprise the user or leave hidden residue.

Disposition: accept — split the panel into a read-only Diagnose section and a
confirmed, mutating Repair section (and split the CLI into =net diagnose= vs =net
repair=). Added =cleanup_verified= + a terminal =cleanup-unverified= state to the
diagnostics contract.

** DONE Define panel state, cancellation, and permissions UX :blocking:
The panel sections list buttons and a streaming output area, but not loading
states, disabled states, empty states, keyboard/focus behavior, cancellation, or
permission-denied handling. Add panel state machines for connection list loading,
rescan in progress, activation in progress, diagnostics running, speedtest
running, and no NetworkManager/no WiFi/no permissions/no GPG key/no
librespeed-cli. Each long operation should be cancellable where possible or
clearly non-cancellable with an elapsed-time display. This blocks the GTK work
because without it the implementer must invent the user flow.

Disposition: modify — accepted the state-machine requirement (added "Panel state,
cancellation, permissions"), but scoped the state set to what can actually occur
on the two-machine fleet: dropped "no NetworkManager" as a modeled state (NM is
always present; a missing nmcli is a single hard-error exit) and kept
no-wifi-hardware, missing speedtest-go, no-GPG-key, plus the in-progress states
with elapsed-time + cancellation where the op allows.

** DONE Verify speed-test dependency, server choice, and failure contract :blocking:
The spec chooses =librespeed-cli= and notes availability/default-server research
as an open risk, but Phase 3 still depends on parsing its JSON and showing
progress. I checked the upstream project page
(https://github.com/librespeed/speedtest-cli) and the AUR URL named by search is
not sufficient as a verified package/install contract in this spec. Add the
exact package name/source to install, command version expected, JSON shape,
server-selection policy, timeout, cancellation behavior, offline/rate-limited
messages, and tests with fixture JSON and fixture stderr. This blocks Phase 3
because speed-test failure modes are otherwise undefined.

Disposition: modify — verified live and changed the backend: =speedtest-go= (AUR
=speedtest-go-bin=, 1.x) is already installed on velox and supports =--json=,
=--server=, =--no-download/--no-upload=, so v1 needs no new dependency.
librespeed-cli (AUR =librespeed-cli= / =-bin=) is the documented self-hosted
fallback. Added the "Speed test" section with server policy, timeout,
cancellation, the failure-message mapping, and fixture-JSON/stderr tests.

** DONE Define dependency installation and repo boundaries :blocking:
The files touched section alternates between archsetup paths and the external
dotfiles repo, while pocketbook has been folded into this repo and its previous
archsetup provisioning was intentionally removed. The spec should state where
the =net= package actually lives, which repository owns the scripts/tests,
whether =gtk4-layer-shell=, =python-gobject=, =librespeed-cli=, =gpg=, =nmcli=,
=curl=, and =resolvectl= are installed by archsetup or assumed present, and the
Makefile targets for test/lint/install. This blocks implementation because the
current path plan can produce code that is not installed on a fresh machine.

Disposition: accept — added the "Repository + dependencies" section: all code in
=~/.dotfiles= (=net/= package in-tree like pocketbook, scripts in the hyprland
tier, tests under =tests/=), archsetup owns only the dep install
(=gtk4-layer-shell=, =python-gobject=, =speedtest-go-bin=; nmcli/curl/resolvectl
already present), Makefile =make test= collects the package suite, and a
daily-drivers note for ratio. Rewrote Files touched to match.

** DONE Expand the test plan for failure, concurrency, and live verification :blocking:
The testing plan covers normal parsing and fake command sequences, but it misses
the riskiest behaviors: slow/hung =nmcli=/=curl=/=librespeed=, concurrent
=net status= cache refresh, corrupt cache, stale cache after SSID change,
permission denied, sudo declined, DNS override cleanup failure, NM partial
activation, duplicate connection names, secret redaction, missing optional
dependencies, no WiFi hardware, wired+tether+WiFi ambiguity, portal redirect
tokens, and Waybar click handlers. Add unit/fixture tests for each class plus a
manual/live checklist gated out of the normal suite. This is blocking because
the current plan would leave the exact "things that can go wrong here" mostly
untested.

Disposition: accept — rewrote the Testing plan with the "Failure + concurrency"
class (slow/hung commands, single-flight, corrupt/stale cache, perm-denied,
cleanup-failure, partial activation, redaction, missing deps, no-wifi,
multi-active) and a per-phase live checklist gated out of the suite.

** DONE Define status JSON schemas and compatibility rules
The spec says all subcommands take =--json= but does not define schemas. Add
versioned JSON examples for =status=, =probe=, =list=, =diagnose=, =speedtest=,
and error envelopes, including nullable fields and unknown/degraded states. This
is non-blocking for product direction but should be fixed before code so tests
can lock the CLI contract.

Disposition: accept — added the "JSON schemas" section with versioned (=v:1=)
envelopes for status / probe / list / diagnose / speedtest and a shared error
envelope, including the degraded/unknown states.

** DONE Rename or alias the phasing section for workflow compatibility
The spec has a usable =Phasing= section, but the spec-review workflow expects an
=Implementation phases= section that can be lifted into =todo.org=. Rename it or
add an alias heading during response. This is non-blocking because the existing
phase decomposition is understandable, but aligning the heading prevents future
workflow friction.

Disposition: accept — renamed =Phasing= → =Implementation phases= and added
per-phase acceptance criteria.

** DONE Add documentation and rollout acceptance checks
Rollback is described, but docs and rollout are thin. Add README/user-guide
updates for commands, panel behavior, config file, GPG opt-in, troubleshooting,
and rollback; add acceptance checks for each phase, including a fresh-login
Waybar smoke test and restoring =custom/netspeed=. This is non-blocking but
important for handing the feature to a future session without re-discovery.

Disposition: accept — added per-phase acceptance criteria under Implementation
phases (incl. the fresh-login waybar smoke test and the =custom/netspeed=
restore), a Phase 4 "Docs + rollout", and (answering Craig's cj follow-up) a
dedicated "Help + documentation" section with the three help layers (CLI help,
panel help affordance, user guide).

** DONE Add a failure-mode coverage table :blocking:
The spec now names many individual network failures, but it still does not carry
one compact coverage matrix that says, for each common failure mode, whether
=net diagnose= detects it, whether =net doctor --fix= can repair it, and what
terminal user action remains when it cannot. Add a table covering at least:
rfkill soft block, rfkill hard block, no WiFi hardware, associated/no DHCP,
gateway unreachable, captive DNS hijack, broken DNS where 1.1.1.1 works, HTTP
portal, HTTP interception without a parseable portal URL, upstream/AP outage,
wrong WPA password or missing secret, enterprise auth/cert failure, duplicate
SSID/connection-name ambiguity, hidden SSID, multiple active links, wedged
NetworkManager, slow/hung command, stale/corrupt cache, DNS cleanup failure,
missing speedtest backend, and VPN/routing interference. This blocks because
Craig asked for confidence that the diagnostics and doctor cover the real field
failures, and prose scattered across sections is too easy to misread.

Disposition: accept — added the "Failure-mode coverage" section: a 22-row table
(every mode the finding named) with detect / doctor-fix / terminal-action
columns, conformed to the org-table standard (rules under every row, ≤120).

** DONE Pin DNS repair semantics in doctor :blocking:
The spec diagnoses DNS hijack, broken hotel DNS, and the temporary 1.1.1.1
override test, but =net doctor --fix= does not say whether it merely recommends
the override, applies a temporary override during recovery, or leaves DNS alone
after diagnosis. Define the exact behavior for each DNS class: captive hijack
should open the portal, broken DNS where 1.1.1.1 works should either offer an
explicit temporary repair with cleanup verification or recommend the command,
and port-53/egress blocking should stop as upstream/not locally fixable. This is
blocking because DNS is one of the most common "connected but unusable" failures
and the current doctor contract is ambiguous.

Disposition: accept — added "DNS handling in doctor (explicit per class)" under
the new Doctor section: hijack → open portal (no DNS mutation); broken-but-1.1.1.1
→ explicit temporary override with cleanup verification under =--fix=, recommend
otherwise; egress-blocked → terminal =upstream-not-local=.

** DONE Make auth failures terminal user-action states :blocking:
Wrong WPA password, missing NM secret, locked keyring/polkit denial, enterprise
802.1X certificate/identity failure, and portal login-required are not fixed by
resetting or bouncing NetworkManager. The doctor sequence should classify these
as =needs-user-action= terminal states, stop before looping through destructive
repairs, and tell the user the exact next action (enter password, edit profile in
=nmtui=/=nmcli=, accept portal terms, provide cert/identity, or retry with
admin auth). This blocks because repeated reset/bounce against auth failures is
slow, noisy, and can make the network state worse without helping.

Disposition: accept — added the =needs-user-action= terminal outcome to the
Doctor section: wrong password / missing secret / keyring-or-polkit denial /
802.1X cert-or-identity failure / portal-login-required all stop the doctor before
any destructive repair and name the exact next step.

** DONE Define upstream/AP/provider failure terminal states :blocking:
Some failures are not client-repairable: AP has no uplink, hotel gateway is
down, DHCP server is broken, gateway drops traffic, ISP outage, or captive
portal backend is failing. The spec should define how =diagnose= proves "local
link is up but upstream is broken" and how =doctor --fix= stops after local
repairs are exhausted with a clear message like "local repairs tried; likely
upstream/AP/provider" plus the evidence. This blocks because users need to know
when to stop poking the laptop and switch networks or contact the venue.

Disposition: accept — added the =upstream-not-local= terminal outcome: diagnose
proves link-up + IP + gateway-reachable but no route out and no captive redirect;
=doctor --fix= stops after local repairs with "local repairs tried; likely
upstream/AP/provider" + evidence → switch network / contact venue.

** DONE Decide how VPN and policy routing affect v1 diagnosis
VPN/WireGuard management is Phase 5, but active VPNs, policy routes, DNS
overrides, and firewall killswitches can break apparent internet access in v1.
The current spec does not say whether v1 detects active VPN/policy routing and
classifies "network is fine, VPN route/DNS is broken" separately from WiFi
failure. Add either a v1 diagnostic check for active VPN/default-route/DNS
ownership with a "deferred repair" outcome, or explicitly state that VPN-routed
failures are out of scope and may be misclassified. This is blocking if Craig
expects the module to diagnose normal daily-driver network failures while VPN
tooling remains separate.

Disposition: accept (chose the detect-and-classify option) — v1 detects an active
VPN / non-NM default route / non-NM DNS owner and classifies =deferred/vpn= ("link
is fine; internet is VPN-routed"), distinct from a WiFi failure. v1 does not
repair it (VPN management is Phase 5); it names the VPN as the likely owner and
stops. Added to the Doctor section + the coverage table + a doctor-classification
test.

** DONE Remove stale GPG-store references from the resolved spec
The spec now decides "no separate credential store; secrets live in
NetworkManager", but the Testing plan still mentions =gpg round-trip= and =GPG
store= tests, and the panel-state list still mentions a no-GPG-key state. Remove
those stale references and replace them with NM-secret/no-secret-leak tests.
This is non-blocking for product behavior but blocking for implementation
clarity: otherwise tests will be written for a credential store that no longer
exists.

Disposition: accept — replaced the Testing-plan =gpg round-trip= / =GPG store=
bullets with an "NM secrets / no-leak" test (add/edit writes the secret via nmcli;
assert no PSK/EAP in any JSON/log/error; no store to round-trip) and dropped the
=no-GPG-key= panel state. Residue from the cj-comment pass that dropped the store.

** DONE Reconcile status, goal, and task text before implementation :blocking:
The spec status says "Implementation-ready with caveats" and "Phase 1 ready to
build", but the body still has an unresolved enterprise add/edit VERIFY, the
Goal still says "optional GPG-encrypted secret store", and the unified task title
still names "GPG-stored secrets" even though the accepted design removed the
store. Before implementation, make the top-level status, goal, scope, task
mapping, and resolved decisions agree with the current design. This blocks
readiness because a developer starting from the top of the file would still build
or plan around abandoned GPG-store behavior.

Disposition: accept — fixed the Goal ("secrets stay in NM's own store"), the
=[#B]= task-mapping line (notes the "GPG-stored secrets" framing is superseded by
decision 5), the enterprise VERIFY (now resolved → Status updated), and corrected
the stale =pytest= mentions to =unittest= (the repo's actual harness). Top-of-file
status/goal/scope/decisions now agree with the design.

** DONE Resolve enterprise add/edit scope or make the caveat explicit :blocking:
The spec still says "One open question for Craig: pull enterprise add/edit into
v1?" and points to a VERIFY in =todo.org=. That is a real product-scope decision:
if enterprise add/edit is in v1, panel forms, nmcli command sequences, tests,
error messages, and docs change materially; if it is out, the UI must consistently
show activate-only with "edit in nmtui/nmcli". Decide it in the spec before
implementation, or downgrade the status to =Ready with caveats= with this exact
accepted caveat. As written, the spec cannot be plain =Ready=.

Disposition: accept — Craig decided (2026-06-29): enterprise add/edit is vNext,
activate-only in v1. Settled in the Status line, the Scope/Out bullet, decision 9,
and the VERIFY (now DONE in todo.org). The UI shows activate-only with "edit in
nmtui/nmcli" consistently. Evidence: 24 saved profiles, 0 enterprise.

** DONE Define the concrete test harness and coverage gate :blocking:
The spec says TDD, fake binaries on PATH, and benchmark tests, but it does not
define the actual harness contract: pytest vs unittest for the =net= package,
where fake =nmcli=/=curl=/=speedtest-go=/=rfkill=/=resolvectl= live, how test
fixtures encode command histories, how subprocess timeouts are simulated, how
Waybar scripts are executed end-to-end, and how coverage is run. Add the exact
Makefile targets (=test=, =test-unit= or package-local =pytest=), pytest config,
coverage command (e.g. branch coverage over =net/= and =waybar-net= wrappers),
minimum threshold, and the rule for reading the coverage report to add missing
tests before declaring a phase done. This blocks readiness because "what is the
test harness?" is still answerable only by analogy to older suites.

Disposition: accept — added the "Harness + coverage gate" section. Corrected the
premise: the repo is =unittest= (=make test= → =python3 -m unittest=, 33 suites),
not pytest. Pinned the fake-binary stub convention (=tests/<name>/fake-*= on a
temp PATH), the fixture command→output map, timeout simulation, the end-to-end
=waybar-net= subprocess run, and coverage via a throwaway venv (coverage.py is
absent system-wide) with a ≥90% branch target on the pure modules.

** DONE Use coverage to find missing behavior, not just report a percentage :blocking:
The spec does not say how coverage findings affect implementation. For this
feature, line coverage alone can miss the important holes: doctor classification
branches, cleanup-unverified paths, redaction paths, degraded hot-path fallbacks,
timeout branches, and auth/upstream/VPN terminal states. Define coverage review
criteria per phase: branch coverage for pure classifiers and parsers, named
untested branches allowed only with comments or manual-check entries, and a
required "coverage gap pass" after the first green test run that maps uncovered
logic back to tests or consciously excluded live-only behavior. This blocks
readiness because the current test plan is broad but does not force the suite to
expose missing edge tests.

Disposition: accept — added the "Coverage as a gap-finder, not a number (per
phase)" subsection: branch coverage required for the doctor classifier (every
outcome), cleanup-unverified, redaction, degraded-fallback, timeout, and the
parsers; a mandatory coverage-gap pass after the first green run mapping each
uncovered branch to a test or a named live-only exclusion; a phase isn't done
until that pass is recorded.

** DONE Convert error classes into exact user-facing strings and evidence fields :blocking:
The failure table and doctor outcomes classify errors well, but many messages
are still templates or descriptions rather than final text. Add exact strings
for indicator tooltip, notification, CLI stderr, JSON =error.message=, and panel
banner/step text for every failure-mode row, including cases doctor cannot fix:
wrong password, missing secret, enterprise cert failure, upstream/AP/provider
failure, VPN-routed failure, hard rfkill block, DNS cleanup failure, speedtest
missing, and HTTP interception without parseable URL. For each string, specify
the redacted evidence included and the next action. This blocks UX readiness
because "useful error" is only testable once the actual text and evidence are
defined.

Disposition: accept — rewrote the Failure states section: each row now carries the
exact final string (with =<placeholder>= evidence), the evidence field, and the
next action, plus a per-surface rendering rule (indicator tooltip / notify /
CLI+JSON error.message+detail+code / panel banner all render the one canonical
string). Added the missing doctor-unfixable rows: hard rfkill, wrong password /
missing secret, enterprise cert failure, upstream/AP/provider, VPN-routed, HTTP
interception without a parseable URL, and DNS cleanup-unverified.

** DONE Add an enhancement disposition table
The spec captures several good enhancements (doctor, Makefile recovery, rfkill,
airplane absorption, VPN phase), but it does not show that low-cost adjacent
enhancements were considered and accepted/deferred/rejected. Add a small radar
table for likely affordances: copy redacted doctor report, open/copy portal URL,
retry with hardware MAC, forget network, rescan now, pin speedtest server, show
last good network/result, watch mode for =net doctor=, desktop notification
actions, QR-code/share WiFi import/export, and keyboard picker. Mark each
=v1=, =vNext=, or =rejected= with a one-line reason. This is non-blocking, but it
prevents accidental loss of cheap UX wins and keeps the v1 panel focused.

Disposition: accept — added the "Enhancement radar" table dispositioning all the
named affordances: open/copy portal URL, forget network, rescan, hardware-MAC
retry, pin speedtest server, copy redacted doctor report = v1; last-good
network/result, doctor watch mode, actionable notifications, keyboard picker =
vNext; QR-share = rejected (low value for a 2-machine personal setup).

** DONE Tighten the panel UX flow before Phase 2
The panel has sections and state machines, but not a concrete interaction flow:
default focused section, row content, primary/secondary buttons, disabled-state
rules, confirmation wording for reset/bounce/DNS override, how "Get me online"
reports each escalation, what stays visible after the panel closes, and keyboard
navigation. Add a short UX flow spec or wire-level outline before Phase 2. This
is non-blocking for Phase 1, but it blocks Phase 2 implementation because a GTK
panel can easily become noisy or surprising if these defaults are invented while
coding.

Disposition: accept — added the "Panel UX flow (settle before Phase 2)"
subsection: default focus (Connections, or Diagnose when opened from a captive
state), row content, one primary button per section, disabled-state rules, exact
confirmation wording for reset/bounce/DNS-override/remove, the live "Get me
online" escalation reporting, what survives panel close, and keyboard nav.

* Review and iteration history

** 2026-06-29 Mon @ 17:00:39 -0400 — Codex — reviewer

- *What changed or was recommended:* Rubric: =Not ready=. Applied the
  spec-review workflow and added blocking findings for diagnostics structure,
  user-facing errors, observability, nmcli contracts, cache concurrency,
  hot-path performance, Waybar click actions, mutation rollback, credential
  security, unsupported WiFi types, panel states, speed-test dependency
  verification, install boundaries, and test coverage.
- *Why:* The spec has the right broad shape, but this feature combines a status
  indicator, NetworkManager mutation, sudo-backed remediation, captive-portal
  probing, secret storage, and GTK UI. Too many failure modes were still left for
  implementers to invent during coding.
- *Artifacts:* Findings recorded in =Review findings=. Local code read included
  =~/.dotfiles/hyprland/.local/bin/waybar-netspeed=,
  =~/.dotfiles/common/.local/bin/captive=,
  =~/.dotfiles/tests/waybar-netspeed/test_waybar_netspeed.py=,
  =~/.dotfiles/tests/captive/test_captive.py=, pocketbook's GTK layer-shell
  scaffold, current Waybar config, and the linked =todo.org= tasks. External
  checks included the Arch nmcli man page and upstream LibreSpeed CLI page.

** 2026-06-29 Mon @ 17:35:00 -0400 — Claude Code (archsetup) — responder

- *What changed:* Processed all 18 findings — 13 accepted, 5 modified, 0
  rejected. Accepts added new sections: Repository + dependencies, Performance
  budgets, nmcli contract, JSON schemas, Diagnostics contract (+ Diagnose vs
  Repair), Failure states/messages/recovery, Observability, cache
  concurrency/atomicity/staleness, mutation safety + rollback, the rewritten
  credential security model, the expanded test plan, JSON schemas, the renamed
  Implementation phases with per-phase acceptance criteria, and a Status line.
  Modifies: deferred the =net debug-bundle= command (log + =diagnose --json=
  cover it), made click actions phase-aware + non-blocking, scoped enterprise
  WiFi to activate-only (add/edit vNext), trimmed the panel state set to states
  that occur on the two-machine fleet, and switched the speed-test backend to the
  already-installed =speedtest-go= (verified live) instead of a new
  librespeed-cli dependency.
- *Why:* Codex's review was strong and mostly right; the modifies are scope calls
  for a single-user, two-machine tool (avoid gold-plating, use installed deps,
  don't model impossible states) and one phasing reconciliation (the panel is
  Phase 2, so Phase-1 clicks can't open it).
- *Artifacts:* All 18 findings completed =DONE= in place with dispositions. Live
  check: =pacman -Qq | grep speed= (speedtest-go-bin installed; =speedtest-go
  --json= confirmed). Status now "review incorporated (Codex, 2026-06-29)",
  implementation-ready with the named Phase-2/3 caveats; Phase 1 ready to build.

** 2026-06-29 Mon @ 17:37:58 -0400 — Claude Code (archsetup) — responder (cj comments)

- *What changed:* Folded in Craig's 10 cj comments on the spec. Design changes:
  (1) dropped the separate credential store entirely — secrets stay in NM's own
  =system-connections= (root =0600=, inline), touched via nmcli; no GPG, no
  gpg-agent (rewrote the secrets section, decision 5, dropped Phase 4 + the gpg
  dep). (2) Added =net doctor [--fix]= + Makefile console-recovery targets
  (=make online= etc.) as a first-class TTY path; reversed the earlier
  defer-the-doctor call (decision 10). (3) Added a full-stack =bounce= repair and
  an =rfkill= unblock repair + indicator state — the rfkill one recovers the
  framework-laptop post-power-loss soft-block Craig hit. (4) =custom/net= absorbs
  the airplane module; the standalone airplane scripts/tests/module are deleted on
  ship. (5) Moved VPN/WireGuard from "out" to a planned Phase 5. (6) Added a
  "Help + documentation" section (CLI help / panel help / user guide). Answered
  the enterprise-defer rationale and the captive-auto-login explanation inline.
- *Why:* Craig's comments simplify (NM-only secrets, fewer deps) and harden the
  recovery story (doctor + make targets + rfkill/bounce reachable from a dead-GUI
  console — his stated need). Enterprise add/edit in v1 is the one open call,
  raised as a VERIFY.
- *Artifacts:* All 10 cj blocks removed. Live checks: airplane scripts/tests
  present (confirmed deletion targets); =/etc/NetworkManager/system-connections/=
  is root =0600= with inline secrets (confirms the NM-only secret model);
  =rfkill= present. VERIFY filed in =todo.org= for the enterprise-scope decision.

** 2026-06-29 Mon @ 17:49:24 -0400 — Codex — reviewer

- *What changed or was recommended:* Added a second feedback batch from the
  common network failure-mode review: require a diagnose/doctor coverage table,
  pin DNS repair semantics, make auth failures terminal user-action states,
  define upstream/AP/provider terminal states, decide how active VPN/policy
  routing affects v1 diagnosis, and remove stale GPG-store references.
- *Why:* The design now has a solid recovery shape, but the field question is
  sharper: for each common failure, does =diagnose= catch it, does =doctor --fix=
  fix it, or does the user need to switch networks/provide credentials/contact
  the venue? That mapping needs to live in the spec, not in chat.
- *Artifacts:* Findings appended under =Review findings= after reviewing the
  current spec's Diagnostics, Failure states, Doctor, Testing plan, and
  responder history.

** 2026-06-29 Mon @ 18:06:22 -0400 — Codex — reviewer

- *What changed or was recommended:* Added a readiness pass focused on
  development-time questions, missed enhancements, UX cleanliness, error wording,
  and test/coverage mechanics. New findings require reconciling stale status/goal
  text, resolving or explicitly caveating enterprise add/edit, defining the
  concrete harness and coverage gate, making coverage actionable, turning failure
  classes into exact user-facing strings, adding an enhancement disposition
  table, and tightening the Phase-2 panel UX flow.
- *Why:* The network failure model is now strong, but implementation would still
  force developers to invent test infrastructure, coverage expectations, final
  error copy, and several UX details. The spec should decide those up front or
  explicitly mark them as caveats.
- *Artifacts:* Findings appended under =Review findings= after reading the
  updated spec, current archsetup Makefile test targets, pocketbook's pyproject,
  and the linked =todo.org= enterprise VERIFY.

** 2026-06-29 Mon @ 17:55:08 -0400 — Claude Code (archsetup) — responder (2nd review)

- *What changed:* Processed all 6 second-batch findings — 6 accepted, 0 modified,
  0 rejected. Added a "Doctor: escalation, classification, terminal states"
  section with four outcomes (=fixable= escalating lightest-first; =needs-user-
  action=, =upstream-not-local=, =deferred/vpn= terminal) and explicit per-class
  DNS handling. Added a "Failure-mode coverage" table (22 rows: detect / doctor-fix
  / terminal action, org-table-standard conformant, ≤120). Cleaned the last
  GPG-store residue (Testing plan → NM-secret/no-leak test; dropped the no-GPG-key
  panel state). Added a doctor-classification test class.
- *Why:* The findings sharpen the doctor from "escalate repairs" to "classify,
  then stop at the right terminal state" — the safety property that keeps
  =doctor --fix= from looping destructive repairs against a wrong password or an
  upstream outage. The VPN finding got the detect-and-classify option (not repair;
  that's Phase 5) so v1 doesn't misread a VPN-routed failure as WiFi down. The
  coverage table is the single artifact Craig asked for to trust the field
  coverage.
- *Artifacts:* All 6 findings =DONE= in place; cookie =[24/24]=. Table conformed
  via =wrap-org-table.el= (max line 120). No new findings introduced; no scope
  expansion past what the findings asked. Spec remains implementation-ready, the
  enterprise add/edit VERIFY still the one open decision.

** 2026-06-29 Mon @ 18:24:51 -0400 — Claude Code (archsetup) — responder (3rd review)

- *What changed:* Processed the third review (7 findings). The enterprise-scope
  finding closed first on Craig's call (vNext, activate-only). The other 6, all
  accepted: (1) reconciled the top-of-file text — fixed the Goal's GPG-store
  wording, the =[#B]= task-mapping line, the resolved enterprise VERIFY, and
  corrected the stale =pytest= mentions to =unittest= (the repo's real harness).
  (2) Added a "Harness + coverage gate" section (unittest, fake-binary stubs on a
  temp PATH, venv coverage, ≥90% branch on pure modules). (3) Added a per-phase
  "coverage as a gap-finder" pass. (4) Rewrote the Failure states section to exact
  final strings + evidence fields + a per-surface rendering rule, and added the
  missing doctor-unfixable rows. (5) Added the "Enhancement radar" table
  (v1/vNext/rejected). (6) Added the "Panel UX flow" subsection.
- *Why:* The findings close the gap between "design decided" and "a developer can
  start": the harness/coverage contract, the exact UX strings, and the panel flow
  are the things otherwise invented mid-code. The =pytest=→=unittest= correction
  was a real defect — the spec contradicted the repo's actual test convention.
- *Artifacts:* All 31 findings =DONE=; cookie =[31/31]=. Both new tables conformed
  via =wrap-org-table.el= (coverage 120, radar 110). Harness verified against the
  live repo (33 unittest suites, =make test=, coverage.py absent → venv). Status
  raised to "Ready for Phase 1; Ready-with-caveats overall" — no open decisions
  remain.