Diffstat (limited to 'docs/design/gptel-network-tools.org')
-rw-r--r-- docs/design/gptel-network-tools.org 407
1 files changed, 0 insertions, 407 deletions
diff --git a/docs/design/gptel-network-tools.org b/docs/design/gptel-network-tools.org
deleted file mode 100644
index aae2cc2a8..000000000
--- a/docs/design/gptel-network-tools.org
+++ /dev/null
@@ -1,407 +0,0 @@
-#+TITLE: Design: gptel network tools
-#+AUTHOR: Craig Jennings
-#+DATE: 2026-05-16
-#+OPTIONS: toc:nil num:nil
-
-* Status
-
-Draft. Brainstorm output captured from a =/brainstorm= session on
-2026-05-16. Sibling to
-=docs/design/gptel-git-tools-magit-backend.org= and the broader theme
-hierarchy under =** TODO [#B] GPTel Tool Work= in =todo.org=.
-
-The conventional vs tail-sample exploration covered three categories
-(network, text/data, build/code). Network was selected as the next
-build target; this doc captures the network slice in full. The other
-two categories are referenced briefly and live as theme stubs under
-=*** TODO [#B] Filesystem Related Tools= and
-=*** TODO [#B] Development Workflow Related Tools= in =todo.org=.
-
-* Problem
-
-The current =gptel-tools/= set covers filesystem CRUD, web fetch, and
-git status/log/diff. When the user asks the agent "why can't I reach
-X?" or "what's on my LAN right now?" the agent has no affordances --
-it can only suggest commands the user runs manually.
-
-Network diagnosis is a recurring task on this laptop (homelab, mixed
-wifi/wired, occasional VPN, NetworkManager-managed connections). The
-agent should be able to run read-only network probes directly, return
-structured findings, and synthesize an explanation. Anything that
-mutates network state (=nmcli connection up=, route changes) stays
-behind =:confirm t=.
-
-* Non-goals
-
-- Active offensive scanning, vulnerability probes, or exploitation
- tooling. Out of scope at the wrapper boundary -- nmap's
- =-A=/=-O=/aggressive modes are rejected, NSE is deferred.
-- Scanning networks the user doesn't own. Public targets are gated
- behind an explicit =:external t= flag and =:confirm t=.
-- Real-time/streaming inspection (=iftop=, =nethogs=, =tcpdump
- follow=). Snapshot tools only; streaming tools don't fit the
- request/response shape of gptel tools.
-- Replacing Magit's git tooling, mu4e's mail handling, or any other
- Emacs-native workflow. Network tooling is the gap.
-
-* Approaches considered
-
-The =/brainstorm= run generated six candidate themes across three
-categories. Three conventional (high-prior), three tail samples
-(genuinely different regions of the option space). Network was
-chosen as the first build target; the others are recorded for
-follow-up sessions.
-
-** Recommended: network triage bundle (conventional #1)
-
-Five tools covering discovery, diagnostics, and inspection:
-
-| Tool | Purpose |
-|-------------------+--------------------------------------------------|
-| =net_diagnose= | "Why can't I reach X?" -- composite probe |
-| =net_discover= | "What's on this subnet?" -- LAN host discovery |
-| =net_services= | "What's listening on host X?" -- service detect |
-| =network_status= | "What's my current network state?" -- snapshot |
-| =dns_lookup= | Typed DNS query (A/AAAA/MX/NS/TXT/SRV/CAA) |
-
-Detailed in =* Design= below.
-
-*** Pros
-
-- Hits the highest-leverage daily question (connectivity diagnosis)
- with a single mental entry point (=net_diagnose=).
-- Atomic tools (=dns_lookup=, =network_status=) for cases the
- composite is too coarse for.
-- All read-only at the network layer; =:confirm nil= for RFC1918,
- =:confirm t= for public targets.
-- nmap's two genuinely-unique capabilities (subnet discovery, service
- enumeration) get first-class wrappers.
-
-*** Cons
-
-- Five tools is heavy for one category. Some are thin wrappers around
- a single command.
-- Composite =net_diagnose= hides which sub-check fired; debugging the
- tool itself is harder than debugging atomic tools.
-- nmap is the one tool that *can* get the user in trouble. Target
- gating must be airtight or it's the wrong tool to ship.
-
-** Rejected: code-quality fan-out (conventional #2)
-
-=shellcheck_run=, =format_check= (black/prettier/gofmt/rustfmt/elisp,
-returns unified diff), =lint_run= (eslint/ruff/golangci-lint),
-=dot_render=, =mermaid_render=.
-
-Folded into =*** TODO [#B] Development Workflow Related Tools= as
-per-language work rather than a standalone bundle. Most of the
-per-language wins land in the existing prog-*.el modules' format-on-save
-and LSP attachments; the agent benefits more from /reading/ those
-buffers than from re-running the formatters via tool calls.
-
-** Rejected: GitHub workspace (conventional #3)
-
-=gh_pr_view=, =gh_issue_search=, =gh_run_logs=, =gh_pr_diff=.
-
-Overlaps with the magit-backend track (=gptel-git-tools-magit-backend=)
-for several queries. Better treated as a follow-on once the magit
-backend lands -- some queries are local (magit) and some are remote
-(gh), and the seam is clearer after the local side is built.
-
-** Rejected: DNS-chain inspector (tail sample)
-
-=dns_chain= walks NS -> A/AAAA -> MX -> SPF -> DMARC -> DKIM for a
-domain and returns a structured assessment with red flags ("MX
-missing TLS-RPT", "SPF includes >10 lookups", "DMARC policy=none").
-
-Valuable when it's needed, but that is probably 5 calls/year on this
-laptop. =dns_lookup= covers 90% of the recurring need; the chain
-walker is parked for a possible follow-on.
-
-** Rejected: awk_eval / sed_eval with explanation (tail sample)
-
-Accept snippet + sample input, return both the transformed output and
-a plain-English explanation of what the snippet does.
-
-This duplicates work the model already does internally: it is
-already good at generating and explaining awk/sed. The only real win
-would be executing the snippet against real data, which the eshell
-escape hatch in the Filesystem section already covers.
-
-** Adopted as project convention: plan/apply split (tail sample)
-
-=rsync_plan= / =rsync_apply= split: plan always runs =--dry-run= and
-returns the file list and byte counts that *would* transfer; apply is
-a separate tool registration with =:confirm t=. Same shape for
-=nmcli= (status read vs connection mutate) and any other mutating
-tool.
-
-Promoted to a documented convention rather than a single tool: any
-mutating wrapper in =gptel-tools/= should split into a preview and an
-apply. The preview is =:confirm nil= so the agent can plan
-autonomously; the apply is =:confirm t= and stops cleanly for human
-review. Applies to =rsync=, =nmcli connection up=, =ssh= mutations,
-and the pandoc/ffmpeg/imagemagick output-writing tools in the
-Filesystem section.
-
-* Design
-
-** Tool 1: =net_diagnose=
-
-Composite "why can't I reach X?" probe. Given a target (hostname or
-IP), runs a sequence of sub-checks and returns a structured result:
-
-1. =dig +short= on the name (skip if target is an IP literal).
-2. =ping -c 3 -W 2= against the resolved IP.
-3. =traceroute -n -w 2 -q 1 -m 20= to the IP.
-4. If a port is given: =curl --max-time 5 -o /dev/null -sw '%{http_code}\n'=
- for ports 80/443, or =nc -zv -w 3= for arbitrary TCP ports.
-
-Output shape (alist or plist returned to the model):
-
-#+begin_src text
- ((target . "example.com")
- (resolved-to . "93.184.216.34")
- (dns-time-ms . 12)
- (ping . ((sent . 3) (received . 3) (avg-ms . 14.2)))
- (traceroute . ((hops . 8) (last-hop . "93.184.216.34")))
- (port-check . ((port . 443) (status . "200") (tls . "ok"))))
-#+end_src
-
-Caps: total runtime <30s. Each sub-check has its own timeout. If a
-sub-check fails (no ping reply, no route, no DNS), the field carries
-the failure mode rather than aborting the whole call -- the agent
-needs the partial picture to reason.
-
-=:confirm nil= for private targets; =:confirm t= for public targets
-(see the =:confirm= posture table). Read-only.
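-
-A minimal sketch of the partial-failure behaviour described above,
-assuming each sub-check is a zero-argument function that either
-returns an alist or signals an error. =my/net-run-checks= is a
-hypothetical name, not an existing helper:
-
-#+begin_src emacs-lisp
-  ;; Hypothetical helper (not part of any existing module).
-  (defun my/net-run-checks (checks)
-    "Run CHECKS, an alist of (NAME . FUNCTION), and return an alist of results.
-  A check that signals an error contributes ((error . MESSAGE)) as its
-  value instead of aborting the remaining checks."
-    (mapcar (lambda (check)
-              (cons (car check)
-                    (condition-case err
-                        (funcall (cdr check))
-                      (error
-                       (list (cons 'error (error-message-string err)))))))
-            checks))
-
-  ;; Usage with stand-in sub-checks: the failed check is recorded and
-  ;; the successful one is still returned.
-  (my/net-run-checks
-   (list (cons 'ping (lambda () '((sent . 3) (received . 3))))
-         (cons 'traceroute (lambda () (error "No route to host")))))
-  ;; => ((ping (sent . 3) (received . 3))
-  ;;     (traceroute (error . "No route to host")))
-#+end_src
-
-The total-runtime cap is not modelled here; it would sit on top of
-the per-sub-check timeouts.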
-
-** Tool 2: =net_discover=
-
-Wraps =nmap -sn <subnet>= for LAN host discovery. Two argv shapes:
-
-- =net_discover ()= -- defaults to the current LAN, derived from
- =ip route get 1.1.1.1= and the matching interface's =/24=.
-- =net_discover :subnet "192.168.1.0/24"= -- explicit subnet.
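-
-The default-subnet derivation could look roughly like this, assuming
-=ip route get= prints a =src <address>= field for the outgoing
-interface (typical iproute2 output) and that the interface really is
-on a /24. =my/net-lan-from-route-output= is a hypothetical name:
-
-#+begin_src emacs-lisp
-  ;; Hypothetical helper: takes the text printed by `ip route get 1.1.1.1'.
-  (defun my/net-lan-from-route-output (output)
-    "Return the /24 CIDR containing the `src' address in OUTPUT, or nil."
-    (when (string-match
-           "\\bsrc \\([0-9]+\\)\\.\\([0-9]+\\)\\.\\([0-9]+\\)\\.[0-9]+"
-           output)
-      ;; Keep the first three octets and zero the host octet.
-      (format "%s.%s.%s.0/24"
-              (match-string 1 output)
-              (match-string 2 output)
-              (match-string 3 output))))
-
-  (my/net-lan-from-route-output
-   "1.1.1.1 via 192.168.1.1 dev wlan0 src 192.168.1.42 uid 1000")
-  ;; => "192.168.1.0/24"
-#+end_src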
-
-Guardrails:
-
-- Subnet must be RFC1918, link-local (169.254/16), CGNAT (100.64/10),
- or loopback. Public subnets rejected at the validator.
-- Prefix length must be /22 or longer (no /16 or other wider subnets).
- A /22 spans 1024 addresses -- enough for any homelab. Default home
- network is /24.
-- =--host-timeout 30s --max-retries 1= to bound runtime.
-
-Output: list of =(ip mac hostname state)= tuples.
-
-=:confirm nil= for RFC1918 / link-local / CGNAT / loopback. Public
-subnets never reach this tool (validator rejects).
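-
-A sketch of the subnet guardrail, under the assumption that only
-IPv4 CIDR strings are accepted. All =my/net-= names are hypothetical;
-the real validator may differ:
-
-#+begin_src emacs-lisp
-  (require 'cl-lib)
-
-  ;; Hypothetical table: RFC1918, link-local, CGNAT, and loopback ranges.
-  (defconst my/net-private-cidrs
-    '(("10.0.0.0" . 8) ("172.16.0.0" . 12) ("192.168.0.0" . 16)
-      ("169.254.0.0" . 16) ("100.64.0.0" . 10) ("127.0.0.0" . 8))
-    "Alist of (BASE . PREFIX) ranges treated as private.")
-
-  (defun my/net-ipv4-to-int (ip)
-    "Convert dotted-quad IP to an integer, or return nil if malformed."
-    (when (string-match-p "\\`[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+\\'" ip)
-      (let ((octets (mapcar #'string-to-number (split-string ip "\\."))))
-        (when (cl-every (lambda (o) (<= 0 o 255)) octets)
-          (cl-reduce (lambda (acc o) (+ (* acc 256) o)) octets)))))
-
-  (defun my/net-in-cidr-p (ip-int base prefix)
-    "Return non-nil if IP-INT lies inside BASE/PREFIX."
-    (let ((shift (- 32 prefix)))
-      (= (ash ip-int (- shift))
-         (ash (my/net-ipv4-to-int base) (- shift)))))
-
-  (defun my/net-validate-subnet (cidr)
-    "Return (NETWORK PREFIX) if CIDR is private and /22 or narrower.
-  Signal a `user-error' otherwise."
-    (unless (string-match "\\`\\([0-9.]+\\)/\\([0-9]+\\)\\'" cidr)
-      (user-error "Malformed CIDR: %s" cidr))
-    (let* ((network (match-string 1 cidr))
-           (prefix (string-to-number (match-string 2 cidr)))
-           (ip-int (my/net-ipv4-to-int network)))
-      (cond
-       ((null ip-int) (user-error "Malformed address: %s" network))
-       ((not (<= 22 prefix 32))
-        (user-error "Prefix must be /22 to /32: %s" cidr))
-       ((not (cl-some (lambda (range)
-                        (my/net-in-cidr-p ip-int (car range) (cdr range)))
-                      my/net-private-cidrs))
-        (user-error "Not a private subnet: %s" cidr))
-       (t (list network prefix)))))
-
-  (my/net-validate-subnet "192.168.1.0/24")
-  ;; => ("192.168.1.0" 24)
-#+end_src
-
-Because every allowed range is /16 or wider and the prefix is at
-least /22, checking the network address alone is enough: an accepted
-subnet cannot straddle a range boundary.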
-
-** Tool 3: =net_services=
-
-Wraps =nmap -sV= for service/version detection on a single host.
-
-Argv:
-
-- =:host= -- required. RFC1918 / link-local / CGNAT / loopback by
- default. Public hosts require =:external t= which flips
- =:confirm t=.
-- =:ports= -- optional port spec. Default: top-100 (=--top-ports
- 100=). Custom lists allowed: ="22,80,443,5432,6379"= or
- ="1-1024"=. Hard cap: 1024 ports total.
-- =:fast= -- if t, uses =--top-ports 20= for a quick check.
-
-Mode allowlist enforced at the wrapper: only =-sV= with optional
-=-p=. Reject =-A=, =-O=, =-T4=/=-T5=, =--script=, raw-packet flags.
-
-Output: list of =(port protocol state service version banner)=
-tuples, parsed from =-oG -= (greppable output).
-
-=:confirm nil= for RFC1918 / link-local / CGNAT / loopback.
-=:confirm t= for any target reachable only as a public IP/hostname.
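-
-One way to enforce the mode allowlist is to build the argv from fixed
-flags so that no caller-supplied flag ever reaches nmap. The sketch
-below assumes the host has already passed target validation; the
-=my/net-= names are hypothetical:
-
-#+begin_src emacs-lisp
-  ;; Hypothetical helpers; the real wrapper may be shaped differently.
-  (defun my/net-count-ports (spec)
-    "Return the number of ports named by SPEC, e.g. \"22,80\" or \"1-1024\".
-  Overlapping entries are counted twice, which errs on the strict side."
-    (let ((total 0))
-      (dolist (part (split-string spec ",") total)
-        (cond
-         ((string-match "\\`\\([0-9]+\\)-\\([0-9]+\\)\\'" part)
-          (let ((lo (string-to-number (match-string 1 part)))
-                (hi (string-to-number (match-string 2 part))))
-            (unless (<= 1 lo hi 65535)
-              (user-error "Bad port range: %s" part))
-            (setq total (+ total 1 (- hi lo)))))
-         ((string-match-p "\\`[0-9]+\\'" part)
-          (unless (<= 1 (string-to-number part) 65535)
-            (user-error "Bad port: %s" part))
-          (setq total (1+ total)))
-         (t (user-error "Bad port spec: %S" part))))))
-
-  (defun my/net-services-argv (host &optional ports fast)
-    "Build the nmap argv for a service scan of HOST.
-  PORTS is an optional port spec string; FAST selects the top-20 set."
-    (when (string-prefix-p "-" host)
-      (user-error "Host must not look like a flag: %s" host))
-    (append '("nmap" "-sV" "-oG" "-")
-            (cond
-             (ports
-              (when (> (my/net-count-ports ports) 1024)
-                (user-error "Port spec exceeds the 1024-port cap"))
-              (list "-p" ports))
-             (fast '("--top-ports" "20"))
-             (t '("--top-ports" "100")))
-            (list host)))
-
-  (my/net-services-argv "192.168.1.10" "22,80,443")
-  ;; => ("nmap" "-sV" "-oG" "-" "-p" "22,80,443" "192.168.1.10")
-#+end_src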
-
-** Tool 4: =network_status=
-
-Snapshot of the local network state. Composite of:
-
-- =ip -br addr= -- interfaces and their addresses.
-- =ip route= -- routing table.
-- =nmcli -t -f NAME,TYPE,DEVICE,STATE connection show --active= --
- active NetworkManager connections.
-- =ss -tulpn= (or =netstat -tulpn= fallback) -- listening sockets.
-- =resolvectl status= (or =/etc/resolv.conf= fallback) -- DNS
- resolver state.
-
-Output: structured alist with sections for each.
-
-=:confirm nil=. Read-only.
-
-Note: this is also the candidate target for the plan/apply split if
-=nmcli connection up=/=down= ever lands as a tool -- =network_status=
-becomes the "plan" side and any mutation is a separate tool.
-
-** Tool 5: =dns_lookup=
-
-Typed DNS query. Argv:
-
-- =:name= -- required. The DNS name to query.
-- =:type= -- record type. Default =A=. Allowed: =A=, =AAAA=, =MX=,
- =NS=, =TXT=, =SRV=, =CAA=, =CNAME=, =PTR=, =SOA=.
-- =:server= -- optional resolver. Default uses system resolver.
- When set, must be RFC1918 or one of a small allowlist (=1.1.1.1=,
- =8.8.8.8=, =9.9.9.9=) so the tool can't be used to probe arbitrary
- hosts via DNS.
-
-Output: list of records with TTL. For =MX= and =SRV=, includes
-priority/weight/port. For =TXT=, the records are split into the
-quoted segments dig returns.
-
-=:confirm nil=. Read-only.
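-
-A sketch of the argument checks, assuming the tool shells out to
-=dig= with =+noall +answer=. The =my/dns-= names are hypothetical,
-and the RFC1918 branch for =:server= is represented by an optional
-predicate argument rather than implemented here:
-
-#+begin_src emacs-lisp
-  ;; Hypothetical constants mirroring the allowlists in the text above.
-  (defconst my/dns-allowed-types
-    '("A" "AAAA" "MX" "NS" "TXT" "SRV" "CAA" "CNAME" "PTR" "SOA"))
-
-  (defconst my/dns-public-resolvers '("1.1.1.1" "8.8.8.8" "9.9.9.9"))
-
-  (defun my/dns-lookup-argv (name &optional type server private-p)
-    "Build a dig argv for NAME, or signal a `user-error'.
-  TYPE defaults to \"A\".  SERVER must be an allowlisted public resolver
-  or satisfy PRIVATE-P, an optional predicate for private addresses."
-    (let ((type (upcase (or type "A"))))
-      (unless (member type my/dns-allowed-types)
-        (user-error "Record type not allowed: %s" type))
-      ;; dig treats leading -, + and @ as options or a server.
-      (when (string-match-p "\\`[-+@]" name)
-        (user-error "Name must not look like a dig option: %s" name))
-      (when (and server
-                 (not (member server my/dns-public-resolvers))
-                 (not (and private-p (funcall private-p server))))
-        (user-error "Resolver not allowed: %s" server))
-      (append (list "dig" "+noall" "+answer")
-              (when server (list (concat "@" server)))
-              (list name type))))
-
-  (my/dns-lookup-argv "example.com" "mx" "1.1.1.1")
-  ;; => ("dig" "+noall" "+answer" "@1.1.1.1" "example.com" "MX")
-#+end_src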
-
-** Shared helpers
-
-In =gptel-tools/network_tools.el= (single file, mirrors the
-magit-backend plan for git tools):
-
-- =cj/gptel-net--validate-target HOST &optional ALLOW-PUBLIC=
- - Resolves HOST. Rejects unless resolved IP is RFC1918 /
- link-local / CGNAT / loopback, unless ALLOW-PUBLIC is non-nil.
- - Returns the resolved IP on success.
-
-- =cj/gptel-net--validate-subnet CIDR=
- - Rejects non-private subnets and subnets wider than /22.
- - Returns =(network mask)= on success.
-
-- =cj/gptel-net--current-lan=
- - Derives the current /24 from =ip route get 1.1.1.1=.
-
-- =cj/gptel-net--run ARGS &key TIMEOUT=
- - Wraps =process-file= with a uniform timeout, color/encoding
- posture, and structured return =(exit-code stdout stderr)=.
-
-- =cj/gptel-net--parse-nmap-greppable STRING=
- - Parses nmap =-oG -= output into structured tuples.
-
-- =cj/gptel-net--truncate TEXT MAX-BYTES=
- - Same shape as the existing per-tool truncate helpers. Open
- question whether this consolidates into =system-lib.el= alongside
- the matching helpers in =web_fetch.el= and =update_text_file.el=.
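-
-A rough sketch of the greppable-output parser for the =-sn= case,
-assuming host lines of the form =Host: <ip> (<name>)<TAB>Status: <state>=.
-It covers only the IP, hostname, and state fields, and
-=my/parse-nmap-greppable-hosts= is a hypothetical name:
-
-#+begin_src emacs-lisp
-  ;; Hypothetical parser; comment lines and Ports: lines are skipped.
-  (defun my/parse-nmap-greppable-hosts (string)
-    "Return a list of (IP HOSTNAME STATE) parsed from nmap -oG STRING.
-  HOSTNAME is nil when nmap reports an empty name."
-    (let (hosts)
-      (dolist (line (split-string string "\n" t) (nreverse hosts))
-        (when (string-match
-               "\\`Host: \\([^ ]+\\) (\\([^)]*\\))\tStatus: \\([A-Za-z]+\\)"
-               line)
-          (let ((ip (match-string 1 line))
-                (name (match-string 2 line))
-                (state (match-string 3 line)))
-            (push (list ip
-                        (if (string= name "") nil name)
-                        (downcase state))
-                  hosts))))))
-
-  (my/parse-nmap-greppable-hosts
-   (concat "# Nmap scan initiated\n"
-           "Host: 192.168.1.1 (router.lan)\tStatus: Up\n"
-           "Host: 192.168.1.20 ()\tStatus: Up\n"))
-  ;; => (("192.168.1.1" "router.lan" "up") ("192.168.1.20" nil "up"))
-#+end_src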
-
-** Caps
-
-| Tool | Default cap | Hard cap |
-|------------------+------------------------+------------------------|
-| =net_diagnose= | <30s total runtime | <30s total runtime |
-| =net_discover= | /24 default, /22 max | /22 |
-| =net_services= | top-100 ports | 1024 ports |
-| =network_status= | uncapped (snapshot) | uncapped |
-| =dns_lookup= | uncapped | uncapped |
-
-** =:confirm= posture
-
-| Tool | RFC1918 target | Public target |
-|------------------+-------------------+-------------------------|
-| =net_diagnose= | =:confirm nil= | =:confirm t= |
-| =net_discover= | =:confirm nil= | rejected at validator |
-| =net_services= | =:confirm nil= | =:confirm t= |
-| =network_status= | =:confirm nil= | n/a (local snapshot) |
-| =dns_lookup= | =:confirm nil= | =:confirm nil= |
-
-=dns_lookup= stays =:confirm nil= for public names because DNS is
-read-only and innocuous. =net_diagnose= and =net_services= against
-public targets are gated: pinging or probing public hosts isn't
-*illegal*, but it can trip rate limits or get the user flagged on a
-managed network.
-
-** Tests
-
-Single file =tests/test-gptel-tools-network-tools.el=. Real subnets
-are not available in CI, so:
-
-- =net_discover= and =net_services= are stubbed via =cl-letf= on
- =cj/gptel-net--run=, returning canned nmap output. Real nmap
- invocation tested via one =:tags '(:integration)= test that runs
- =nmap -sn 127.0.0.1/32= and asserts the parser handles the real
- format.
-- =net_diagnose= sub-checks stubbed individually so each failure mode
- can be exercised.
-- =network_status= sections stubbed per-command; one integration test
- runs against the live system and asserts the structure parses.
-- =dns_lookup= stubbed against canned =dig= output; one integration
- test against =localhost= via the system resolver.
-
-Rough count: ~12 shared-helper tests (validators, current-lan
-detector, parsers) + ~7 per tool x 5 tools = ~47 tests.
-
-** Risk surface
-
-| Risk | Mitigation |
-|-----------------------------------------------------------+---------------------------------------------------------------------|
-| nmap scan against an unintended target | Validator gates on resolved IP, not on the input string. Public |
-| | targets require explicit =:external t= flag + =:confirm t=. |
-| Scan triggers IDS/IPS on a corporate/managed network | Default modes are non-aggressive (=-sn=, =-sV= only). No =-A=, no |
-| | NSE, no high T-level. =:confirm t= for non-RFC1918 targets gives |
-| | the user a manual checkpoint. |
-| =net_diagnose= hangs on a slow target | Per-sub-check timeouts; total runtime cap; partial-failure return |
-| | rather than abort. |
-| nmap not installed on the system | =:command= check at module load via =cj/executable-find-or-warn= |
-| | (matching the prettier/pyright pattern documented in CLAUDE.md). |
-| Network tools shell out via =process-file= | argv-list invocation, no shell. =shell-quote-argument= unused |
-| | because no shell is involved. |
-| /tmp pollution or banner output writing to disk | All output captured to buffer via =process-file=, never written. |
-
-* Open questions
-
-1. *Default port set for =net_services=.* Top-100 (nmap's =-F= set),
- top-1000 (nmap's default scan, slower), or a custom homelab-tuned
- list (=22, 80, 443, 445, 3389, 5432, 6379, 8080, 8443, 9090, 9000,
- 631=)? My read: top-100 default + =:fast t= for top-20 + custom
- override for the homelab list when needed.
-2. *NSE in v1 or deferred?* Skip entirely (clean v1) or ship a small
- allowlist (=ssl-cert=, =http-title=, =ssh-hostkey=)? My read:
- skip in v1. If a real use case shows up (TLS audit), add a single
- =net_tls_audit= tool wrapping just =ssl-enum-ciphers=/=ssl-cert=
- rather than a generic NSE escape hatch.
-3. *Consolidate the truncate helper.* Same open question as the
- magit-backend doc: move =cj/gptel-net--truncate= and its siblings
- into =system-lib.el= as =cj/gptel-tools--truncate-bytes=, or keep
- per-module? My read: consolidate when there are three callers
- (web_fetch, update_text_file, network_tools all qualify).
-4. *Composite vs atomic for =net_diagnose=.* Build it as one
- composite, or break it into =ping_run=, =traceroute_run=,
- =port_check= and let the agent compose? My read: composite is
- better -- the agent reasons in "diagnose-this-target" terms more
- often than in "just-ping-this". Atomic sub-tools can be added
- later if the composite proves coarse-grained.
-5. *Promote plan/apply split to documented convention now?* Or wait
- until a second tool exercises it (post-rsync)? My read: document
- the convention in the Filesystem section body now, since pandoc /
- ffmpeg / imagemagick all benefit, even before any of them ship.
-6. *nmcli mutation tools.* Out of scope for this doc but worth
- flagging: =nmcli connection up <name>= / =nmcli connection down
- <name>= / =nmcli device wifi connect <ssid>=. These would be the
- first apply-side tools under the plan/apply convention, with
- =network_status= as the plan side.
-
-* Effort estimate
-
-M (1-3 hours). Five tools + shared helpers + ~47 tests. Most of the
-time is test authoring (canned nmap output, dig output, ss output);
-production code is small because each tool is a thin =process-file=
-wrapper plus a parser.
-
-* Next steps
-
-- Resolve open questions #1 and #2 before any code lands (the
- =net_services= shape can't be finalized without them).
-- Once approved, the work attaches to =*** TODO [#B] (Network bundle:
- net_diagnose / net_discover / net_services / network_status /
- dns_lookup)= -- a new theme under =*** TODO [#B] (Networking tools
- category)= which itself becomes a new top-level under =** TODO [#B]
- GPTel Tool Work= in =todo.org=, peer to the existing Filesystem
- section.
-- Implementation follows =/start-work= flow: TDD, characterization
- tests for the parsers first (canned nmap/dig/ss fixtures), then
- the wrappers, then the registrations in
- =cj/gptel-local-tool-features=.
-- After landing, revisit candidate #6 (plan/apply split) -- the
- first apply-side tool (=nmcli connection up=, =rsync_apply=,
- pandoc-output) exercises the convention end-to-end.