The Emacs regexp implementation, like many of its kind, is generally robust but occasionally causes trouble in either of two ways: matching may run out of internal stack space and signal an error, and it can take a long time to complete. The advice below will make these symptoms less likely and help alleviate problems that do arise.
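Anchor regexps at the beginning of a line, string or buffer using zero-width assertions (‘^’ and ‘\`’). This takes advantage of fast paths in the implementation and can avoid futile matching attempts. Other zero-width assertions may also bring benefits by causing a match to fail early.
Since the last branch of an or-pattern does not add a backtrack point on the stack, consider putting the most likely matched pattern last. (It is a trade-off: successfully matched or-patterns run faster with the most frequently matched pattern first.)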
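
As a small illustration of anchoring, the following sketch checks for a prefix at the very start of a string. The function name `my-starts-with-key-p` and the `key=` format are hypothetical, chosen only for this example.

```elisp
;; Hypothetical helper, not part of Emacs: test whether STR starts
;; with a lowercase key followed by `='.
(defun my-starts-with-key-p (str)
  "Return non-nil if STR begins with lowercase letters followed by `='."
  (let ((case-fold-search nil))         ; make [a-z] lowercase-only
    ;; \\` matches only at the start of the string, so a match that
    ;; cannot succeed there is not retried at later positions.
    (string-match-p "\\`[a-z]+=" str)))

(my-starts-with-key-p "name=value")     ; => 0
(my-starts-with-key-p "  name=value")   ; => nil
```

Without the leading ‘\`’, the second call would succeed at index 2, because the matcher would try every starting position in turn.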
Be especially careful with nested repetitions: they can easily result in very slow matching in the presence of ambiguities. For example, ‘\(?:a*b*\)+c’ will take a long time attempting to match even a moderately long string of ‘a’s before failing. The equivalent ‘\(?:a\|b\)*c’ is much faster, and ‘[ab]*c’ better still.
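
One rough way to observe the difference is to time a failing match for each of the three regexps. This is only a sketch: the names `my-nested-regexps` and `my-time-failing-match` are made up for the example, the input is kept short on purpose, and actual timings depend on the Emacs version and the machine.

```elisp
(require 'benchmark)

;; Illustrative constant: the three regexps from the text above.
(defconst my-nested-regexps
  '("\\(?:a*b*\\)+c"      ; nested repetition, ambiguous
    "\\(?:a\\|b\\)*c"     ; one repetition over an or-pattern
    "[ab]*c")             ; bracket expression
  "Three regexps that match the same set of strings.")

(defun my-time-failing-match (regexp n)
  "Return the seconds spent failing to match REGEXP against N `a's."
  (let ((text (make-string n ?a)))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME).
    (car (benchmark-run 1 (string-match-p regexp text)))))

;; All three agree on a string that matches...
(mapcar (lambda (re) (string-match-p re "aabbc")) my-nested-regexps)
;; ...but may differ widely in how long a failed match takes.
(mapcar (lambda (re) (my-time-failing-match re 16)) my-nested-regexps)
```

Increase the length passed to `my-time-failing-match` gradually; the time for the first regexp can grow very quickly with the input length.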
Consider using rx (see Rx Notation); it can optimize some or-patterns automatically and will never introduce capturing groups unless explicitly requested.
If you run into regexp stack overflow despite following the above advice, don’t be afraid of performing the matching in multiple function calls, each using a simpler regexp where backtracking can more easily be contained.
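
The sketch below combines both suggestions under some assumptions: it uses `rx` to build the regexps and splits the work into two simple matches instead of one larger regexp. The function `my-parse-assignment` and the KEY=VALUE format are hypothetical and serve only as an example.

```elisp
(require 'rx)

;; Hypothetical helper, not part of Emacs.
(defun my-parse-assignment (line)
  "Parse LINE of the form KEY=VALUE using two simple matches.
Return (KEY . VALUE), or nil if LINE does not have that form."
  (let ((case-fold-search nil))
    ;; Step 1: match only the key, anchored at the start of the string.
    ;; `rx' creates a capturing group only where `group' is written.
    (when (string-match (rx bos (group (+ (in "a-z"))) "=") line)
      (let ((key (match-string 1 line))
            (start (match-end 0)))
        ;; Step 2: match the value on its own, searching from START.
        ;; Accept the result only if the match begins exactly at START.
        (when (eq start
                  (string-match (rx (group (+ (not (in " \t")))) eos)
                                line start))
          (cons key (match-string 1 line)))))))

(my-parse-assignment "name=value")      ; => ("name" . "value")
(my-parse-assignment "name=a b")        ; => nil
```

Each step uses a short regexp with a single repetition, so a failure in either step involves little backtracking.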
Copyright © 1990-1996, 1998-2022 Free Software Foundation, Inc.
Licensed under the GNU GPL license.
https://www.gnu.org/software/emacs/manual/html_node/elisp/Regexp-Problems.html