<feed xmlns='http://www.w3.org/2005/Atom'>
<title>dotemacs/gptel-tools/web_fetch.el, branch load-graph-classify-start</title>
<subtitle>My Emacs configuration
</subtitle>
<id>https://git.cjennings.net/dotemacs/atom?h=load-graph-classify-start</id>
<link rel='self' href='https://git.cjennings.net/dotemacs/atom?h=load-graph-classify-start'/>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/dotemacs/'/>
<updated>2026-05-16T16:30:04+00:00</updated>
<entry>
<title>feat(gptel-tools): harden path validation with file-truename realpath</title>
<updated>2026-05-16T16:30:04+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-16T16:30:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/dotemacs/commit/?id=244d4c56768fcc60bd1b23fe45df7a57c7b293ec'/>
<id>urn:sha1:244d4c56768fcc60bd1b23fe45df7a57c7b293ec</id>
<content type='text'>
Resolves PATH through file-truename before applying home-directory and
read/write checks across the path-handling tools (git_status, git_log,
git_diff, move_to_trash, read_text_file, update_text_file,
write_text_file, list_directory_files, read_buffer, web_fetch).
Without the resolve step, a symlink under HOME pointing outside HOME
would pass the prefix check but the tool would act on the real target
-- a symlink-escape.
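
A minimal sketch of the resolve-then-check order described above.
The name `my/path-under-home-p' is hypothetical and is not the helper
these tools use; only `file-truename', `expand-file-name',
`file-name-as-directory' and `string-prefix-p' are standard Emacs
functions.

```elisp
(defun my/path-under-home-p (path home)
  "Return non-nil if PATH, after symlink resolution, lies under HOME.
Hypothetical helper for illustration only."
  (let ((real-home (file-name-as-directory (file-truename home)))
        (real-path (file-truename (expand-file-name path home))))
    ;; Compare canonical names, so a symlink under HOME that points
    ;; elsewhere is judged by its real target, not its apparent location.
    ;; The trailing slash keeps /home/user from matching /home/user2.
    (string-prefix-p real-home (file-name-as-directory real-path))))
```

Checking the prefix on the unresolved name instead would accept a
link under HOME whose target sits outside it, which is the escape the
commit closes.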

move_to_trash also tightens the trash-bin construction (treats empty
file extensions correctly) and switches the "critical directories"
list to truename-resolved canonical forms so a symlinked ~/.config
can't be trashed via an aliased path.

update_text_file fixes an off-by-one in the line-count derivation
when the source content is empty.
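
The source does not show how update_text_file derives its line count,
so the following is only one plausible shape of the problem, with a
hypothetical helper name.  Splitting an empty string on newlines
yields a single empty element, so a count built that way reports 1
line for empty content unless the empty case is handled first.

```elisp
(require 'subr-x)

(defun my/content-line-count (content)
  "Return the number of lines in CONTENT; the empty string has 0 lines.
Hypothetical helper for illustration only."
  (if (string-empty-p content)
      ;; Without this branch, (split-string \"\" \"\\n\") returns (\"\")
      ;; and the count would be 1.
      0
    ;; Drop one trailing newline so \"a\\nb\\n\" counts as 2 lines, not 3.
    (length (split-string (string-remove-suffix "\n" content) "\n"))))
```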

Each source change pairs with tests in tests/test-gptel-tools-*.el
and tests/test-update-text-file.el covering the realpath escape
paths, the empty-extension trash case, and the empty-content line-
count edge.  Combined coverage is now 100% across all ten gptel-tools
source files: 516 / 516 executable lines, 217 tests.
</content>
</entry>
<entry>
<title>feat(gptel-tools): wire web_fetch as a local tool</title>
<updated>2026-05-16T10:17:21+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-16T10:17:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/dotemacs/commit/?id=99d93203a867294addf4927ceec5644b9d3bf322'/>
<id>urn:sha1:99d93203a867294addf4927ceec5644b9d3bf322</id>
<content type='text'>
Fourth ADOPT entry from `docs/design/gptel-tools-shortlist.org'.
Lets gptel pull a URL into the conversation so the model can read
docs / current API shapes / etc. without me copy-pasting.

Shape:

- URL must be `http://' or `https://' (file://, ftp://, javascript:,
  scheme-less, etc. are rejected at the validator).
- HTML responses go through `pandoc -f html -t plain' so the model
  gets readable text rather than markup.  Falls back to
  `w3m -dump -T text/html' if pandoc isn't on PATH and signals
  `user-error' if neither is available.  Pass `raw=t' to skip
  stripping.
- Output capped at 200KB by default, hard cap 1MB; `max_bytes'
  argument lets the caller pick a lower cap.  Truncation reported
  inline.
- 4xx / 5xx response codes signal `error' with the code -- the
  alternative is returning an error page body, which the model
  would treat as content.
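
The validator, cap and truncation rules above could be sketched as
follows.  All `my/fetch-...' names and the marker text are
hypothetical, 200KB and 1MB are assumed to mean multiples of 1024,
and treating a missing or non-positive `max_bytes' as "use the
default" is an assumption; the real helpers may differ on each
point.

```elisp
(defconst my/fetch-default-max-bytes (* 200 1024)
  "Assumed default output cap in bytes.")
(defconst my/fetch-hard-max-bytes (* 1024 1024)
  "Assumed hard output cap in bytes.")

(defun my/fetch-valid-url-p (url)
  "Return t if URL is a string starting with http:// or https://."
  (and (stringp url)
       (let ((case-fold-search t))
         (string-match-p "\\`https?://" url))
       t))

(defun my/fetch-effective-max-bytes (max-bytes)
  "Return the cap to apply for the caller-supplied MAX-BYTES."
  (cond ((not (integerp max-bytes)) my/fetch-default-max-bytes)
        ((>= 0 max-bytes) my/fetch-default-max-bytes)
        ;; Callers may lower the cap but never exceed the hard cap.
        (t (min max-bytes my/fetch-hard-max-bytes))))

(defun my/fetch-truncate (text max-bytes)
  "Return TEXT cut to at most MAX-BYTES bytes, with a marker if cut."
  (if (>= max-bytes (string-bytes text))
      text
    (let ((end (length text)))
      ;; Back off one whole character at a time so a multibyte
      ;; character is never split.
      (while (> (string-bytes (substring text 0 end)) max-bytes)
        (setq end (1- end)))
      (concat (substring text 0 end) "\n[truncated]"))))
```

With these definitions, `(my/fetch-valid-url-p "file:///etc/passwd")'
returns nil and `(my/fetch-truncate "abcdef" 3)' returns "abc"
followed by a newline and the marker.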

`:confirm t' on the tool because every call is a real outbound
network request.  The tool's description warns that the request goes
to whatever host the URL names, including hosts on internal
networks.

`tests/test-gptel-tools-web-fetch.el' -- 20 tests across Normal /
Boundary / Error.  URL validator covers http / https / non-string
/ empty / non-http schemes.  `--effective-max-bytes' covers default
/ low-clamp / hard-cap / passthrough.  Truncate helper covers
under-cap / at-cap / over-cap with the marker.  HTML stripper runs
against real pandoc / w3m (both installed in dev env, neither
should mangle simple markup).  Orchestrator stubs
`cj/gptel-web-fetch--retrieve' via `cl-letf' to cover normal /
raw / 4xx / 5xx / oversize / bad-scheme paths.
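
The stubbing pattern can be sketched like this.  The real tests stub
`cj/gptel-web-fetch--retrieve', whose signature is not shown here, so
the sketch uses hypothetical stand-ins (`my/fetch--retrieve'
returning a (STATUS . BODY) cons, and `my/fetch' as the
orchestrator) purely to show how `cl-letf' swaps the network layer
for the duration of a test.

```elisp
(require 'cl-lib)
(require 'ert)

(defun my/fetch--retrieve (url)
  "Hypothetical network layer: return (STATUS . BODY) for URL."
  (error "Network access is not available in this sketch: %s" url))

(defun my/fetch (url)
  "Hypothetical orchestrator: return the body for URL, or signal on 4xx/5xx."
  (let* ((response (my/fetch--retrieve url))
         (status (car response)))
    (if (>= status 400)
        (error "HTTP %d fetching %s" status url)
      (cdr response))))

(ert-deftest my/fetch-normal-returns-body ()
  ;; The stub replaces the function cell only inside this form.
  (cl-letf (((symbol-function 'my/fetch--retrieve)
             (lambda (_url) (cons 200 "hello"))))
    (should (equal (my/fetch "https://example.com/") "hello"))))

(ert-deftest my/fetch-4xx-signals-error ()
  (cl-letf (((symbol-function 'my/fetch--retrieve)
             (lambda (_url) (cons 404 "Not Found page"))))
    (should-error (my/fetch "https://example.com/missing"))))
```

Because `cl-letf' restores the original definition on exit, no test
touches the network and none leaks its stub into the next.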

Wired into `cj/gptel-local-tool-features' so gptel exposes the
tool on next restart.

Note: `call-process-region' invocation flattened to a single
`with-temp-buffer' with DELETE=t -- the initial draft nested a
second temp buffer and routed output to the inner one, which got
killed before `buffer-string' on the outer ran.  Test caught it.
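
A sketch of the flattened shape, assuming pandoc is on PATH; the
function name is hypothetical and the real helper may differ.  The
input is inserted into one temp buffer, `call-process-region' is
called with DELETE and BUFFER both t, and the output replaces the
input in that same buffer, so `buffer-string' reads from a buffer
that is still alive.

```elisp
(defun my/html-to-plain (html)
  "Return HTML converted to plain text by pandoc.
Hypothetical helper; assumes pandoc is on PATH."
  (with-temp-buffer
    (insert html)
    ;; DELETE=t removes the input region; BUFFER=t sends the program's
    ;; output to this same buffer.
    (let ((exit (call-process-region (point-min) (point-max)
                                     "pandoc" t t nil
                                     "-f" "html" "-t" "plain")))
      (unless (eq exit 0)
        (error "pandoc exited with status %s" exit))
      (buffer-string))))
```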
</content>
</entry>
</feed>
