#+TITLE: Design: gptel network tools
#+AUTHOR: Craig Jennings
#+DATE: 2026-05-16
#+OPTIONS: toc:nil num:nil

* Status

Draft.  Brainstorm output captured from a =/brainstorm= session on
2026-05-16.  Sibling to
=docs/design/gptel-git-tools-magit-backend.org= and the broader theme
hierarchy under =** TODO [#B] GPTel Tool Work= in =todo.org=.

The conventional vs tail-sample exploration covered three categories
(network, text/data, build/code).  Network was selected as the next
build target; this doc captures the network slice in full.  The other
two categories are referenced briefly and live as theme stubs under
=*** TODO [#B] Filesystem Related Tools= and
=*** TODO [#B] Development Workflow Related Tools= in =todo.org=.

* Problem

The current =gptel-tools/= set covers filesystem CRUD, web fetch, and
git status/log/diff.  When the user asks the agent "why can't I reach
X?" or "what's on my LAN right now?" the agent has no affordances --
it can only suggest commands the user runs manually.

Network diagnosis is a recurring task on this laptop (homelab, mixed
wifi/wired, occasional VPN, NetworkManager-managed connections).  The
agent should be able to run read-only network probes directly, return
structured findings, and synthesize an explanation.  Anything that
mutates network state (=nmcli connection up=, route changes) stays
behind =:confirm t=.

* Non-goals

- Active offensive scanning, vulnerability probes, or exploitation
  tooling.  Out of scope at the wrapper boundary -- nmap's
  =-A=/=-O=/aggressive modes are rejected, NSE is deferred.
- Scanning networks the user doesn't own.  Public targets are gated
  behind an explicit =:external t= flag and =:confirm t=.
- Real-time/streaming inspection (=iftop=, =nethogs=, =tcpdump
  follow=).  Snapshot tools only; streaming tools don't fit the
  request/response shape of gptel tools.
- Replacing Magit's git tooling, mu4e's mail handling, or any other
  Emacs-native workflow.  Network tooling is the gap.

* Approaches considered

The =/brainstorm= run generated six candidate themes across three
categories: three conventional (high-prior) and three tail samples
(genuinely different regions of the option space).  Network was
chosen as the first build target; the others are recorded for
follow-up sessions.

** Recommended: network triage bundle (conventional #1)

Five tools covering discovery, diagnostics, and inspection:

| Tool              | Purpose                                          |
|-------------------+--------------------------------------------------|
| =net_diagnose=    | "Why can't I reach X?" -- composite probe        |
| =net_discover=    | "What's on this subnet?" -- LAN host discovery   |
| =net_services=    | "What's listening on host X?" -- service detect  |
| =network_status=  | "What's my current network state?" -- snapshot   |
| =dns_lookup=      | Typed DNS query (A/AAAA/MX/NS/TXT/SRV/CAA)       |

Detailed in =* Design= below.

*** Pros

- Hits the highest-leverage daily question (connectivity diagnosis)
  with a single mental entry point (=net_diagnose=).
- Atomic tools (=dns_lookup=, =network_status=) for cases the
  composite is too coarse for.
- All read-only at the network layer; =:confirm nil= for RFC1918,
  =:confirm t= for public targets.
- nmap's two genuinely-unique capabilities (subnet discovery, service
  enumeration) get first-class wrappers.

*** Cons

- Five tools is heavy for one category.  Some are thin wrappers around
  a single command.
- Composite =net_diagnose= hides which sub-check fired; debugging the
  tool itself is harder than debugging atomic tools.
- nmap is the one tool that *can* get the user in trouble.  Target
  gating must be airtight or it's the wrong tool to ship.

** Rejected: code-quality fan-out (conventional #2)

=shellcheck_run=, =format_check= (black/prettier/gofmt/rustfmt/elisp,
returns unified diff), =lint_run= (eslint/ruff/golangci-lint),
=dot_render=, =mermaid_render=.

Folded into =*** TODO [#B] Development Workflow Related Tools= as
per-language work rather than a standalone bundle.  Most of the
per-language wins land in the existing prog-*.el modules'
format-on-save and LSP attachments; the agent benefits more from
/reading/ those buffers than from re-running the formatters via tool
calls.

** Rejected: GitHub workspace (conventional #3)

=gh_pr_view=, =gh_issue_search=, =gh_run_logs=, =gh_pr_diff=.

Overlaps with the magit-backend track (=gptel-git-tools-magit-backend=)
for several queries.  Better treated as a follow-on once the magit
backend lands -- some queries are local (magit) and some are remote
(gh), and the seam is clearer after the local side is built.

** Rejected: DNS-chain inspector (tail sample)

=dns_chain= walks NS -> A/AAAA -> MX -> SPF -> DMARC -> DKIM for a
domain and returns a structured assessment with red flags ("MX
missing TLS-RPT", "SPF includes >10 lookups", "DMARC policy=none").

Real value when it's useful but probably 5 calls/year for this
laptop.  =dns_lookup= covers 90% of the recurring need; the chain
walker is parked for a possible follow-on.

** Rejected: awk_eval / sed_eval with explanation (tail sample)

Accept snippet + sample input, return both the transformed output and
a plain-English explanation of what the snippet does.

Doubles work the model already does internally -- the model is
already good at generating and explaining awk/sed.  Real win would
only be the actual execution against actual data, which the eshell
escape hatch in the Filesystem section already covers.

** Adopted as project convention: plan/apply split (tail sample)

=rsync_plan= / =rsync_apply= split: plan always runs =--dry-run= and
returns the file list and byte counts that *would* transfer; apply is
a separate tool registration with =:confirm t=.  Same shape for
=nmcli= (status read vs connection mutate) and any other mutating
tool.

Promoted to a documented convention rather than a single tool: any
mutating wrapper in =gptel-tools/= should split into a preview and an
apply.  The preview is =:confirm nil= so the agent can plan
autonomously; the apply is =:confirm t= and stops cleanly for human
review.  Applies to =rsync=, =nmcli connection up=, =ssh= mutations,
and the pandoc/ffmpeg/imagemagick output-writing tools in the
Filesystem section.
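
A minimal sketch of the split for =rsync=, assuming one shared argv
builder behind two separately registered tools.  The names
=my/rsync-argv=, =my/rsync-plan-argv=, and =my/rsync-apply-argv= are
hypothetical and not part of =gptel-tools/=; only the rsync flags are
real.

#+begin_src emacs-lisp
  (defun my/rsync-argv (source dest &optional apply)
    "Return the rsync argv for copying SOURCE to DEST.
  Unless APPLY is non-nil the argv contains --dry-run, so rsync only
  reports what it would transfer."
    (append (list "rsync" "-a" "--itemize-changes" "--stats")
            (unless apply (list "--dry-run"))
            (list source dest)))

  (defun my/rsync-plan-argv (source dest)
    "Preview side: safe to register with :confirm nil."
    (my/rsync-argv source dest nil))

  (defun my/rsync-apply-argv (source dest)
    "Apply side: intended for a registration with :confirm t."
    (my/rsync-argv source dest t))
#+end_src

Keeping the two entry points over one builder means the preview and
the apply cannot drift apart in flags; the only difference between
them is =--dry-run=.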

* Design

** Tool 1: =net_diagnose=

Composite "why can't I reach X?" probe.  Given a target (hostname or
IP), runs a sequence of sub-checks and returns a structured result:

1. =dig +short= on the name (skip if target is an IP literal).
2. =ping -c 3 -W 2= against the resolved IP.
3. =traceroute -n -w 2 -q 1 -m 20= to the IP.
4. If a port is given: =curl --max-time 5 -o /dev/null -sw '%{http_code}\n'=
   for ports 80/443, or =nc -zv -w 3= for arbitrary TCP ports.

Output shape (alist or plist returned to the model):

#+begin_src text
  ((target . "example.com")
   (resolved-to . "93.184.216.34")
   (dns-time-ms . 12)
   (ping . ((sent . 3) (received . 3) (avg-ms . 14.2)))
   (traceroute . ((hops . 8) (last-hop . "93.184.216.34")))
   (port-check . ((port . 443) (status . "200") (tls . "ok"))))
#+end_src

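Caps: total runtime <30s.  Each sub-check has its own timeout, and the
total cap is enforced separately: the traceroute flags above alone
allow up to 40s in the worst case (20 hops x 2s wait).  If a sub-check
fails (no ping reply, no route, no DNS), the field carries the failure
mode rather than aborting the whole call -- the agent needs the
partial picture to reason.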
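
The partial-failure behavior can be sketched as follows.  This is an
illustration only: =my/net-run-checks= is a hypothetical name, and the
sub-checks are modeled as plain functions rather than real process
calls.

#+begin_src emacs-lisp
  (defun my/net-run-checks (checks)
    "Run CHECKS, an alist of (NAME . THUNK), and return an alist of results.
  A THUNK that signals an error contributes (NAME . (error . MESSAGE))
  instead of aborting the remaining checks."
    (mapcar (lambda (check)
              (cons (car check)
                    (condition-case err
                        (funcall (cdr check))
                      (error (cons 'error (error-message-string err))))))
            checks))

  (my/net-run-checks
   (list (cons 'dns  (lambda () "93.184.216.34"))
         (cons 'ping (lambda () (error "No reply within 2s")))))
  ;; => ((dns . "93.184.216.34") (ping error . "No reply within 2s"))
#+end_src

The failed =ping= entry carries its failure message while the =dns=
result is still returned, which is the shape the agent needs to reason
about a half-broken path.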

=:confirm nil= for private targets, =:confirm t= for public targets
(see =** :confirm posture= below).  Read-only either way.

** Tool 2: =net_discover=

Wraps =nmap -sn <subnet>= for LAN host discovery.  Two argv shapes:

- =net_discover ()= -- defaults to the current LAN, derived from
  =ip route get 1.1.1.1= and the matching interface's =/24=.
- =net_discover :subnet "192.168.1.0/24"= -- explicit subnet.
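
One way the default could be derived, sketched under the assumption
that the =src= address printed by =ip route get 1.1.1.1= identifies the
LAN and that its /24 is the subnet wanted.  =my/net-lan-from-route= is
a hypothetical name; it works on captured output, so it can be tested
without touching the network.

#+begin_src emacs-lisp
  (defun my/net-lan-from-route (route-output)
    "Return the /24 CIDR containing the src address in ROUTE-OUTPUT, or nil.
  ROUTE-OUTPUT is the text printed by `ip route get 1.1.1.1'."
    (when (string-match
           " src \\([0-9]+\\.[0-9]+\\.[0-9]+\\)\\.[0-9]+" route-output)
      (concat (match-string 1 route-output) ".0/24")))

  (my/net-lan-from-route
   "1.1.1.1 via 192.168.1.1 dev wlan0 src 192.168.1.42 uid 1000")
  ;; => "192.168.1.0/24"
#+end_src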

Guardrails:

- Subnet must be RFC1918, link-local (169.254/16), CGNAT (100.64/10),
  or loopback.  Public subnets rejected at the validator.
- Prefix length must be /22 or longer (nothing wider than /22, so no
  /16).  A /22 spans 1024 addresses -- enough for any homelab.  The
  default home network is /24.
- =--host-timeout 30s --max-retries 1= to bound runtime.
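
The address and prefix guardrails could be checked roughly as below.
This is a sketch, not the planned validator: all =my/net-*= names are
hypothetical, it handles IPv4 only, and it skips the name-resolution
step that the real target validator performs first.

#+begin_src emacs-lisp
  (defconst my/net-private-ranges
    '(("10.0.0.0" . 8) ("172.16.0.0" . 12) ("192.168.0.0" . 16)
      ("169.254.0.0" . 16) ("100.64.0.0" . 10) ("127.0.0.0" . 8))
    "RFC1918, link-local, CGNAT, and loopback ranges as (BASE . PREFIX).")

  (defun my/net-ipv4-to-int (ip)
    "Convert dotted-quad string IP to an integer, or return nil if malformed."
    (when (string-match-p "\\`[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+\\'" ip)
      (let ((n 0) (ok t))
        (dolist (octet (mapcar #'string-to-number (split-string ip "\\.")))
          (if (<= octet 255)
              (setq n (+ (* n 256) octet))
            (setq ok nil)))
        (and ok n))))

  (defun my/net-private-ip-p (ip)
    "Return t if IP falls inside one of `my/net-private-ranges'."
    (let ((addr (my/net-ipv4-to-int ip)))
      (and addr
           (catch 'found
             (dolist (range my/net-private-ranges)
               ;; Compare only the leading PREFIX bits of both addresses.
               (let ((shift (- (cdr range) 32)))
                 (when (= (ash addr shift)
                          (ash (my/net-ipv4-to-int (car range)) shift))
                   (throw 'found t))))))))

  (defun my/net-validate-subnet (cidr)
    "Return (NETWORK PREFIX) if CIDR is private and /22 or narrower, else nil."
    (when (string-match "\\`\\([0-9.]+\\)/\\([0-9]+\\)\\'" cidr)
      (let ((network (match-string 1 cidr))
            (prefix (string-to-number (match-string 2 cidr))))
        (and (<= 22 prefix 32)
             (my/net-private-ip-p network)
             (list network prefix)))))

  (my/net-validate-subnet "192.168.1.0/24") ; => ("192.168.1.0" 24)
  (my/net-validate-subnet "192.168.0.0/16") ; => nil (wider than /22)
  (my/net-validate-subnet "8.8.8.0/24")     ; => nil (public)
#+end_src

Checking only the network address is enough here because every
allowed range is /16 or wider, so a /22-or-narrower subnet that starts
inside a range cannot extend outside it.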

Output: list of =(ip mac hostname state)= tuples.

=:confirm nil= for RFC1918 / link-local / CGNAT / loopback.  Public
subnets never reach this tool (validator rejects).

** Tool 3: =net_services=

Wraps =nmap -sV= for service/version detection on a single host.

Argv:

- =:host= -- required.  RFC1918 / link-local / CGNAT / loopback by
  default.  Public hosts require =:external t= which flips
  =:confirm t=.
- =:ports= -- optional port spec.  Default: top-100 (=--top-ports
  100=).  Custom lists allowed: ="22,80,443,5432,6379"= or
  ="1-1024"=.  Hard cap: 1024 ports total.
- =:fast= -- if t, uses =--top-ports 20= for a quick check.

Mode allowlist enforced at the wrapper: only =-sV= with optional
=-p=.  Reject =-A=, =-O=, =-T4=/=-T5=, =--script=, raw-packet flags.
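
Because the wrapper builds the argv itself, the allowlist can be
structural: the model supplies a host, a port spec, and a flag, never
raw nmap options.  A sketch under those assumptions, with hypothetical
names (=my/net-port-spec-count=, =my/net-services-argv=) and with HOST
assumed to be the already-validated IP:

#+begin_src emacs-lisp
  (defun my/net-port-spec-count (spec)
    "Return the number of ports named by SPEC, or nil if SPEC is malformed.
  SPEC is a comma-separated list of ports and LOW-HIGH ranges.
  Overlapping entries are counted twice, which errs on the strict side."
    (let ((total 0))
      (catch 'bad
        (dolist (part (split-string spec ","))
          (cond
           ((string-match "\\`\\([0-9]+\\)-\\([0-9]+\\)\\'" part)
            (let ((low (string-to-number (match-string 1 part)))
                  (high (string-to-number (match-string 2 part))))
              (if (<= 1 low high 65535)
                  (setq total (+ total (- high low) 1))
                (throw 'bad nil))))
           ((string-match-p "\\`[0-9]+\\'" part)
            (if (<= 1 (string-to-number part) 65535)
                (setq total (1+ total))
              (throw 'bad nil)))
           (t (throw 'bad nil))))
        total)))

  (defun my/net-services-argv (host &optional ports fast)
    "Return the nmap argv for a version scan of HOST.
  Signals an error if PORTS is malformed or names more than 1024 ports."
    (append
     (list "nmap" "-sV" "-oG" "-")
     (cond
      (ports
       (let ((count (my/net-port-spec-count ports)))
         (unless (and count (<= 1 count 1024))
           (error "Invalid or oversized port spec: %s" ports))
         (list "-p" ports)))
      (fast (list "--top-ports" "20"))
      (t (list "--top-ports" "100")))
     (list host)))

  (my/net-services-argv "192.168.1.10")
  ;; => ("nmap" "-sV" "-oG" "-" "--top-ports" "100" "192.168.1.10")
#+end_src

Flags such as =-A= or =--script= have no path into this argv, so the
rejection list above becomes a property of the builder rather than a
filter that has to be kept complete.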

Output: list of =(port protocol state service version banner)=
tuples, parsed from =-oG -= (greppable output).

=:confirm nil= for RFC1918 / link-local / CGNAT / loopback.
=:confirm t= for any target reachable only as a public IP/hostname.

** Tool 4: =network_status=

Snapshot of the local network state.  Composite of:

- =ip -br addr= -- interfaces and their addresses.
- =ip route= -- routing table.
- =nmcli -t -f NAME,TYPE,DEVICE,STATE connection show --active= --
  active NetworkManager connections.
- =ss -tulpn= (or =netstat -tulpn= fallback) -- listening sockets.
- =resolvectl status= (or =/etc/resolv.conf= fallback) -- DNS
  resolver state.

Output: structured alist with sections for each.

=:confirm nil=.  Read-only.

Note: this is also the candidate target for the plan/apply split if
=nmcli connection up=/=down= ever lands as a tool -- =network_status=
becomes the "plan" side and any mutation is a separate tool.

** Tool 5: =dns_lookup=

Typed DNS query.  Argv:

- =:name= -- required.  The DNS name to query.
- =:type= -- record type.  Default =A=.  Allowed: =A=, =AAAA=, =MX=,
  =NS=, =TXT=, =SRV=, =CAA=, =CNAME=, =PTR=, =SOA=.
- =:server= -- optional resolver.  Default uses system resolver.
  When set, must be RFC1918 or one of a small allowlist (=1.1.1.1=,
  =8.8.8.8=, =9.9.9.9=) so the tool can't be used to probe arbitrary
  hosts via DNS.

Output: list of records with TTL.  For =MX= and =SRV=, includes
priority/weight/port.  For =TXT=, the records are split into the
quoted segments dig returns.
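
A sketch of the record parsing, assuming dig is invoked with
=+noall +answer= so that only answer-section lines are printed (the
exact dig flags are not fixed by this doc).  =my/dns-parse-answer= is a
hypothetical name, and the record data is left as a raw string rather
than split into priority/weight/port.

#+begin_src emacs-lisp
  (defun my/dns-parse-answer (output)
    "Parse dig answer-section OUTPUT into a list of record plists.
  Lines that do not look like NAME TTL IN TYPE DATA are skipped."
    (let (records)
      (dolist (line (split-string output "\n" t))
        (when (string-match
               (concat "\\`\\([^ \t]+\\)[ \t]+\\([0-9]+\\)[ \t]+IN[ \t]+"
                       "\\([A-Z0-9]+\\)[ \t]+\\(.*\\)\\'")
               line)
          (push (list :name (match-string 1 line)
                      :ttl (string-to-number (match-string 2 line))
                      :type (match-string 3 line)
                      :data (match-string 4 line))
                records)))
      (nreverse records)))

  (my/dns-parse-answer "example.com.\t300\tIN\tMX\t10 mail.example.com.\n")
  ;; => ((:name "example.com." :ttl 300 :type "MX" :data "10 mail.example.com."))
#+end_src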

=:confirm nil=.  Read-only.

** Shared helpers

In =gptel-tools/network_tools.el= (single file, mirrors the
magit-backend plan for git tools):

- =cj/gptel-net--validate-target HOST &optional ALLOW-PUBLIC=
  - Resolves HOST.  Rejects unless resolved IP is RFC1918 /
    link-local / CGNAT / loopback, unless ALLOW-PUBLIC is non-nil.
  - Returns the resolved IP on success.

- =cj/gptel-net--validate-subnet CIDR=
  - Rejects non-private subnets and subnets wider than /22.
  - Returns =(network mask)= on success.

- =cj/gptel-net--current-lan=
  - Derives the current /24 from =ip route get 1.1.1.1=.

- =cj/gptel-net--run ARGS &key TIMEOUT=
  - Wraps =process-file= with a uniform timeout, color/encoding
    posture, and structured return =(exit-code stdout stderr)=.

- =cj/gptel-net--parse-nmap-greppable STRING=
  - Parses nmap =-oG -= output into structured tuples.

- =cj/gptel-net--truncate TEXT MAX-BYTES=
  - Same shape as the existing per-tool truncate helpers.  Open
    question whether this consolidates into =system-lib.el= alongside
    the matching helpers in =web_fetch.el= and =update_text_file.el=.
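
As an illustration of the greppable parser, the sketch below reads the
=Ports:= field of each host line, where entries are separated by ", "
and each entry is a slash-delimited record
(port/state/protocol/owner/service/rpc/version).
=my/net-parse-greppable= is a hypothetical name, and the sketch
returns no separate banner element because the greppable format has
no such field.

#+begin_src emacs-lisp
  (defun my/net-parse-greppable (output)
    "Parse nmap -oG OUTPUT into a list of (PORT PROTOCOL STATE SERVICE VERSION)."
    (let (rows)
      (dolist (line (split-string output "\n" t))
        ;; Fields on a host line are tab-separated; take the Ports field.
        (when (string-match "\tPorts: \\([^\t]*\\)" line)
          (dolist (entry (split-string (match-string 1 line) ", " t))
            (let ((f (split-string entry "/")))
              (when (>= (length f) 7)
                (push (list (string-to-number (nth 0 f)) ; port
                            (nth 2 f)                    ; protocol
                            (nth 1 f)                    ; state
                            (nth 4 f)                    ; service
                            (nth 6 f))                   ; version
                      rows))))))
      (nreverse rows)))

  (my/net-parse-greppable
   (concat "# Nmap scan (comment lines are skipped)\n"
           "Host: 192.168.1.10 (nas.lan)\tPorts: "
           "22/open/tcp//ssh//OpenSSH 9.6/, 80/open/tcp//http//nginx/\n"))
  ;; => ((22 "tcp" "open" "ssh" "OpenSSH 9.6") (80 "tcp" "open" "http" "nginx"))
#+end_src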

** Caps

| Tool             | Default cap            | Hard cap               |
|------------------+------------------------+------------------------|
| =net_diagnose=   | <30s total runtime     | <30s total runtime     |
| =net_discover=   | /24 default, /22 max   | /22                    |
| =net_services=   | top-100 ports          | 1024 ports             |
| =network_status= | uncapped (snapshot)    | uncapped               |
| =dns_lookup=     | uncapped               | uncapped               |

** =:confirm= posture

| Tool             | RFC1918 target    | Public target           |
|------------------+-------------------+-------------------------|
| =net_diagnose=   | =:confirm nil=    | =:confirm t=            |
| =net_discover=   | =:confirm nil=    | rejected at validator   |
| =net_services=   | =:confirm nil=    | =:confirm t=            |
| =network_status= | =:confirm nil=    | n/a (local snapshot)    |
| =dns_lookup=     | =:confirm nil=    | =:confirm nil=          |

=dns_lookup= stays =:confirm nil= for public names because DNS is
read-only and innocuous.  =net_diagnose= and =net_services= against
public targets are gated because pinging/probing public hosts isn't
*illegal* but it can trip rate-limits or get the user flagged on a
managed network.

** Tests

Single file =tests/test-gptel-tools-network-tools.el=.  Real subnets
are not available in CI, so:

- =net_discover= and =net_services= are stubbed via =cl-letf= on
  =cj/gptel-net--run=, returning canned nmap output.  Real nmap
  invocation tested via one =:tags '(:integration)= test that runs
  =nmap -sn 127.0.0.1/32= and asserts the parser handles the real
  format.
- =net_diagnose= sub-checks stubbed individually so each failure mode
  can be exercised.
- =network_status= sections stubbed per-command; one integration test
  runs against the live system and asserts the structure parses.
- =dns_lookup= stubbed against canned =dig= output; one integration
  test against =localhost= via the system resolver.
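
The stubbing pattern might look like this.  Everything here is a
stand-in: =my/net-run= and =my/net-discover-raw= are hypothetical
substitutes for the real runner and tool, kept small so the example is
self-contained.

#+begin_src emacs-lisp
  (require 'cl-lib)
  (require 'ert)

  (defun my/net-run (argv)
    "Hypothetical runner: would execute ARGV and return (EXIT STDOUT STDERR)."
    (error "Real process execution is not available: %S" argv))

  (defun my/net-discover-raw (subnet)
    "Return the stdout of a ping scan of SUBNET via `my/net-run'."
    (nth 1 (my/net-run (list "nmap" "-sn" "-oG" "-" subnet))))

  (ert-deftest my/net-discover-uses-stubbed-runner ()
    "The tool sees canned output; no nmap process is started."
    (cl-letf (((symbol-function 'my/net-run)
               (lambda (_argv)
                 (list 0 "Host: 192.168.1.1 (router.lan)\tStatus: Up\n" ""))))
      (should (string-match-p "Status: Up"
                              (my/net-discover-raw "192.168.1.0/24")))))
#+end_src

Because =cl-letf= restores the original function definition when the
body exits, the stub cannot leak into the integration-tagged test that
runs real nmap.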

Rough count: ~12 shared-helper tests (validators, current-lan
detector, parsers) + ~7 per tool x 5 tools = ~47 tests.

** Risk surface

| Risk                                                      | Mitigation                                                          |
|-----------------------------------------------------------+---------------------------------------------------------------------|
| nmap scan against an unintended target                    | Validator gates on resolved IP, not on the input string.  Public    |
|                                                           | targets require explicit =:external t= flag + =:confirm t=.         |
| Scan triggers IDS/IPS on a corporate/managed network      | Default modes are non-aggressive (=-sn=, =-sV= only).  No =-A=, no  |
|                                                           | NSE, no high T-level.  =:confirm t= for non-RFC1918 targets gives   |
|                                                           | the user a manual checkpoint.                                       |
| =net_diagnose= hangs on a slow target                     | Per-sub-check timeouts; total runtime cap; partial-failure return   |
|                                                           | rather than abort.                                                  |
| nmap not installed on the system                          | =:command= check at module load via =cj/executable-find-or-warn=    |
|                                                           | (matching the prettier/pyright pattern documented in CLAUDE.md).    |
| Network tools shell out via =process-file=                | argv-list invocation, no shell.  =shell-quote-argument= unused      |
|                                                           | because no shell is involved.                                       |
| /tmp pollution or banner output writing to disk           | All output captured to buffer via =process-file=, never written.    |

* Open questions

1. *Default port set for =net_services=.*  Top-100 (nmap's =-F= fast
   set), top-1000 (nmap's default scan, slower), or a custom homelab-tuned
   list (=22, 80, 443, 445, 3389, 5432, 6379, 8080, 8443, 9090, 9000,
   631=)?  My read: top-100 default + =:fast t= for top-20 + custom
   override for the homelab list when needed.
2. *NSE in v1 or deferred?*  Skip entirely (clean v1) or ship a small
   allowlist (=ssl-cert=, =http-title=, =ssh-hostkey=)?  My read:
   skip in v1.  If a real use case shows up (TLS audit), add a single
   =net_tls_audit= tool wrapping just =ssl-enum-ciphers=/=ssl-cert=
   rather than a generic NSE escape hatch.
3. *Consolidate the truncate helper.*  Same open question as the
   magit-backend doc: move =cj/gptel-net--truncate= and its siblings
   into =system-lib.el= as =cj/gptel-tools--truncate-bytes=, or keep
   per-module?  My read: consolidate when there are three callers
   (web_fetch, update_text_file, network_tools all qualify).
4. *Composite vs atomic for =net_diagnose=.*  Build it as one
   composite, or break it into =ping_run=, =traceroute_run=,
   =port_check= and let the agent compose?  My read: composite is
   better -- the agent reasons in "diagnose-this-target" terms more
   often than in "just-ping-this".  Atomic sub-tools can be added
   later if the composite proves coarse-grained.
5. *Promote plan/apply split to documented convention now?*  Or wait
   until a second tool exercises it (post-rsync)?  My read: document
   the convention in the Filesystem section body now, since pandoc /
   ffmpeg / imagemagick all benefit, even before any of them ship.
6. *nmcli mutation tools.*  Out of scope for this doc but worth
   flagging: =nmcli connection up <name>= / =nmcli connection down
   <name>= / =nmcli device wifi connect <ssid>=.  These would be the
   first apply-side tools under the plan/apply convention, with
   =network_status= as the plan side.

* Effort estimate

M (1-3 hours).  Five tools + shared helpers + ~47 tests.  Most of the
time is test authoring (canned nmap output, dig output, ss output);
production code is small because each tool is a thin =process-file=
wrapper plus a parser.

* Next steps

- Resolve open questions #1 and #2 before any code lands (the
  =net_services= shape can't be finalized without them).
- Once approved, the work attaches to =*** TODO [#B] (Network bundle:
  net_diagnose / net_discover / net_services / network_status /
  dns_lookup)= -- a new theme under =*** TODO [#B] (Networking tools
  category)= which itself becomes a new top-level under =** TODO [#B]
  GPTel Tool Work= in =todo.org=, peer to the existing Filesystem
  section.
- Implementation follows =/start-work= flow: TDD, characterization
  tests for the parsers first (canned nmap/dig/ss fixtures), then
  the wrappers, then the registrations in
  =cj/gptel-local-tool-features=.
- After landing, revisit candidate #6 (plan/apply split) -- the
  first apply-side tool (=nmcli connection up=, =rsync_apply=,
  pandoc-output) exercises the convention end-to-end.