author Craig Jennings <c@cjennings.net> 2026-02-03 08:13:01 -0600
committer Craig Jennings <c@cjennings.net> 2026-02-03 08:13:01 -0600
commit 552410e64aa3c3cdef3e9485d782a722283f6b45
tree 6e069eed5cf76ea43901593fe3e1b895b18fc1ad
parent 431db43523604631bee3e72c6e53f5c752053ce2
perf(lorem-optimum): fix O(n²) tokenization algorithm

The tokenizer was creating substring copies on every iteration:
- (substring text pos (1+ pos)) for the whitespace check
- (substring text pos) for regex matching (copies ALL remaining text)

This caused 10K word tokenization to take 727ms instead of 6ms.

Fix: Use string-match with its start position parameter and check
characters directly with aref instead of creating substrings.

Performance improvement:
- Tokenize 10K words: 727ms → 6ms (120x faster)
- Learn 10K words: 873ms → 15ms (59x faster)
- Learn 100K words: 70s → 208ms (341x faster)
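
The fix described above can be illustrated with a minimal sketch. This is not the actual lorem-optimum code: the function name `sketch-tokenize`, the token regex, and the whitespace set are assumptions chosen for illustration. Only `aref`, `string-match` with its START argument, `match-string`, and `match-end` are taken from standard Emacs Lisp.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch of index-based tokenization; names and regex
;; are illustrative assumptions, not the package's real implementation.

(defun sketch-tokenize (text)
  "Return the list of word and punctuation tokens found in TEXT.
Scans by index, so the only strings allocated are the tokens themselves."
  (let ((pos 0)
        (len (length text))
        (tokens '()))
    (while (< pos len)
      (if (memq (aref text pos) '(?\s ?\t ?\n ?\r))
          ;; Whitespace check reads one character in place, instead of
          ;; allocating (substring text pos (1+ pos)).
          (setq pos (1+ pos))
        ;; Matching starts at POS via the START argument, instead of
        ;; copying the tail with (substring text pos).  The character
        ;; at POS is not whitespace here, so the second alternative
        ;; guarantees a match beginning exactly at POS.
        (string-match "[[:alnum:]']+\\|[^ \t\n\r]" text pos)
        (push (match-string 0 text) tokens)
        (setq pos (match-end 0))))
    (nreverse tokens)))

(sketch-tokenize "Hello, world!  It's ok.")
;; => ("Hello" "," "world" "!" "It's" "ok" ".")
```

Each loop iteration advances `pos` by at least one character and never copies the remaining text, so total work grows linearly with the input length rather than quadratically.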
Diffstat (limited to 'scripts')
0 files changed, 0 insertions, 0 deletions