 <h4 class="subsubsection">Character Classes</h4>      <p>Below is a table of the classes you can use in a character alternative, and what they mean. Note that the ‘<samp>[</samp>’ and ‘<samp>]</samp>’ characters that enclose the class name are part of the name, so a regular expression using these classes needs one more pair of brackets. For example, a regular expression matching a sequence of one or more letters and digits would be ‘<samp>[[:alnum:]]+</samp>’, not ‘<samp>[:alnum:]+</samp>’. </p> <dl compact> <dt>‘<samp>[:ascii:]</samp>’</dt> <dd><p>This matches any <acronym>ASCII</acronym> character (codes 0–127). </p></dd> <dt>‘<samp>[:alnum:]</samp>’</dt> <dd><p>This matches any letter or digit. For multibyte characters, it matches characters whose Unicode ‘<samp>general-category</samp>’ property (see <a href="character-properties">Character Properties</a>) indicates they are alphabetic or decimal number characters. </p></dd> <dt>‘<samp>[:alpha:]</samp>’</dt> <dd><p>This matches any letter. For multibyte characters, it matches characters whose Unicode ‘<samp>general-category</samp>’ property (see <a href="character-properties">Character Properties</a>) indicates they are alphabetic characters. </p></dd> <dt>‘<samp>[:blank:]</samp>’</dt> <dd><p>This matches horizontal whitespace, as defined by Annex C of the Unicode Technical Standard #18. In particular, it matches spaces, tabs, and other characters whose Unicode ‘<samp>general-category</samp>’ property (see <a href="character-properties">Character Properties</a>) indicates they are spacing separators. </p></dd> <dt>‘<samp>[:cntrl:]</samp>’</dt> <dd><p>This matches any character whose code is in the range 0–31. </p></dd> <dt>‘<samp>[:digit:]</samp>’</dt> <dd><p>This matches ‘<samp>0</samp>’ through ‘<samp>9</samp>’. Thus, ‘<samp>[-+[:digit:]]</samp>’ matches any digit, as well as ‘<samp>+</samp>’ and ‘<samp>-</samp>’. 
</p></dd> <dt>‘<samp>[:graph:]</samp>’</dt> <dd><p>This matches graphic characters—everything except whitespace, <acronym>ASCII</acronym> and non-<acronym>ASCII</acronym> control characters, surrogates, and codepoints unassigned by Unicode, as indicated by the Unicode ‘<samp>general-category</samp>’ property (see <a href="character-properties">Character Properties</a>). </p></dd> <dt>‘<samp>[:lower:]</samp>’</dt> <dd><p>This matches any lower-case letter, as determined by the current case table (see <a href="case-tables">Case Tables</a>). If <code>case-fold-search</code> is non-<code>nil</code>, this also matches any upper-case letter. </p></dd> <dt>‘<samp>[:multibyte:]</samp>’</dt> <dd><p>This matches any multibyte character (see <a href="text-representations">Text Representations</a>). </p></dd> <dt>‘<samp>[:nonascii:]</samp>’</dt> <dd><p>This matches any non-<acronym>ASCII</acronym> character. </p></dd> <dt>‘<samp>[:print:]</samp>’</dt> <dd><p>This matches any printing character—either whitespace, or a graphic character matched by ‘<samp>[:graph:]</samp>’. </p></dd> <dt>‘<samp>[:punct:]</samp>’</dt> <dd><p>This matches any punctuation character. (At present, for multibyte characters, it matches anything that has non-word syntax.) </p></dd> <dt>‘<samp>[:space:]</samp>’</dt> <dd><p>This matches any character that has whitespace syntax (see <a href="syntax-class-table">Syntax Class Table</a>). </p></dd> <dt>‘<samp>[:unibyte:]</samp>’</dt> <dd><p>This matches any unibyte character (see <a href="text-representations">Text Representations</a>). </p></dd> <dt>‘<samp>[:upper:]</samp>’</dt> <dd><p>This matches any upper-case letter, as determined by the current case table (see <a href="case-tables">Case Tables</a>). If <code>case-fold-search</code> is non-<code>nil</code>, this also matches any lower-case letter. </p></dd> <dt>‘<samp>[:word:]</samp>’</dt> <dd><p>This matches any character that has word syntax (see <a href="syntax-class-table">Syntax Class Table</a>). 
</p></dd> <dt>‘<samp>[:xdigit:]</samp>’</dt> <dd><p>This matches the hexadecimal digits: ‘<samp>0</samp>’ through ‘<samp>9</samp>’, ‘<samp>a</samp>’ through ‘<samp>f</samp>’ and ‘<samp>A</samp>’ through ‘<samp>F</samp>’. </p></dd> </dl><div class="_attribution">
  <p class="_attribution-p">
    Copyright &copy; 1990-1996, 1998-2022 Free Software Foundation, Inc. <br>Licensed under the GNU GPL license.<br>
    <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Char-Classes.html" class="_attribution-link">https://www.gnu.org/software/emacs/manual/html_node/elisp/Char-Classes.html</a>
  </p>
</div>
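<p>As a rough illustration of the character classes listed above, the following sketch checks a few strings against them with <code>string-match-p</code>. The helper <code>my-only-class-p</code> is a hypothetical name introduced here for illustration only; it is not part of Emacs. The results shown assume the standard case table. </p>
<pre class="example">
(defun my-only-class-p (class string)
  "Return non-nil if STRING consists only of characters in CLASS.
CLASS is a class name without brackets or colons, such as \"alnum\".
Hypothetical helper, not part of Emacs."
  ;; Bind case-fold-search to nil so that [:upper:] and [:lower:]
  ;; are not widened to match both cases.
  (let ((case-fold-search nil))
    ;; The class name needs its own brackets inside the alternative:
    ;; "alnum" becomes "[[:alnum:]]+", anchored at both string ends.
    (string-match-p (format "\\`[[:%s:]]+\\'" class) string)))

(my-only-class-p "alnum" "abc123")        ; =&gt; 0
(my-only-class-p "xdigit" "DeadBeef")     ; =&gt; 0
(my-only-class-p "xdigit" "xyz")          ; =&gt; nil
(my-only-class-p "upper" "abc")           ; =&gt; nil

;; A class can share one alternative with other characters.
(string-match-p "\\`[-+[:digit:]]+\\'" "+42")      ; =&gt; 0

;; With case-fold-search non-nil, [:upper:] also matches lower case.
(let ((case-fold-search t))
  (string-match-p "[[:upper:]]" "abc"))            ; =&gt; 0

;; Index of the first non-ASCII character ("\u00e9" is e-acute).
(string-match-p "[[:nonascii:]]" "caf\u00e9")      ; =&gt; 3
</pre>
<p>Binding <code>case-fold-search</code> inside the helper is a design choice for this sketch: it keeps the case-sensitive classes from depending on the caller's setting. </p>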